This invention relates to an isolated nucleic acid [...]
wherein expression of the chimeric gene results in
production of altered levels of the carotenoid biosynthetic 
enzyme in a transformed host cell. 



[Figure 1: alignment of the phytoene synthase amino acid sequences SEQ ID NO:27,
SEQ ID NO:28, SEQ ID NO:2 and SEQ ID NO:14 over alignment positions 1 to 426, with
asterisks marking conserved positions. The sequence rows are not recoverable from the
source scan.]




WO 99/55889 PCT/US99/08789 

TITLE 

CAROTENOID BIOSYNTHESIS ENZYMES 
This application claims the benefit of U.S. Provisional Application No. 60/083,042, 
filed April 24, 1998. 

FIELD OF THE INVENTION

This invention is in the field of plant molecular biology. More specifically, this 
invention pertains to nucleic acid fragments encoding enzymes of the carotenoid 
biosynthesis pathway in plants and seeds. 

BACKGROUND OF THE INVENTION 

Plant carotenoids are orange and red lipid-soluble pigments found embedded in the
membranes of chloroplasts and chromoplasts. In leaves and immature fruits the color is
masked by chlorophyll, but in later stages of development these pigments contribute to the
bright color of flowers and fruits. Carotenoids protect against photooxidation processes and
harvest light for photosynthesis. The carotenoid biosynthesis pathway leads to the
production of abscisic acid, with intermediates useful in the agricultural and food industries
as well as products thought to be involved in cancer prevention (Bartley, G. E., and
Scolnik, P. A. (1995) Plant Cell 7:1027-1038).

Phytoene synthase carries out the first step in the carotenoid biosynthetic pathway,
converting geranylgeranyl diphosphate to phytoene. There are two different phytoene
synthases in tomato with different expression patterns: one is expressed at higher levels in
mature fruits while the other is expressed at higher levels in leaves (Bartley, G. E., and
Scolnik, P. A. (1993) J. Biol. Chem. 268:25718-25721). It has been speculated that in corn at
least two different alleles of phytoene synthase should be present, but only one has been
identified to date (Buckner, B. et al. (1996) Genetics 143:479-488).

In the next step of the carotenoid biosynthesis pathway, phytoene desaturase
transforms phytoene into phytofluene. After another desaturation step, the enzyme
zeta-carotene desaturase (carotene 7,8-desaturase; EC 1.14.99.30) converts the lightly colored
zeta-carotene to neurosporene, which is further desaturated into lycopene. Lycopene may
have one of two different fates: through the action of lycopene epsilon cyclase it may
become alpha-carotene, or it may be transformed into beta-carotene by lycopene cyclase.
Beta-carotene hydroxylase converts beta-carotene into zeaxanthin. Zeaxanthin epoxidase
transforms zeaxanthin into violaxanthin and eventually abscisic acid. The genes encoding
this chloroplast-imported protein have been identified in N. plumbaginifolia, pepper and
tomato. Zeaxanthin epoxidase appears to also be involved in protection from environmental
stress (Corinne A. et al. (1998) Plant Phys. 118:1021-1028) and uses FAD as a cofactor
(Buch, K. et al. (1995) FEBS Lett. 376:45-48).

Zeaxanthin is the bright orange product highly prized as a pigmenting agent for
animal feed, which makes the meat fat, skin, and egg yolks a dark yellow (Scott, M. L. et al.
(1968) Poultry Sci. 47:863-872). Gram per gram, zeaxanthin is one of the best pigmenting
compounds because it is highly absorbable. Yellow corn, which produces one of the best
ratios of lutein to zeaxanthin, contains on average 20 to 25 mg of xanthophyll per kg, while
marigold petals yield 6,000 to 10,000 mg/kg.

SUMMARY OF THE INVENTION 

The instant invention relates to isolated nucleic acid fragments encoding carotenoid
biosynthetic enzymes. Specifically, this invention concerns an isolated nucleic acid
fragment encoding a phytoene synthase or a zeaxanthin epoxidase. In addition, this
invention relates to a nucleic acid fragment that is complementary to the nucleic acid
fragment encoding phytoene synthase or zeaxanthin epoxidase.

An additional embodiment of the instant invention pertains to a polypeptide encoding
all or a substantial portion of a carotenoid biosynthetic enzyme selected from the group
consisting of phytoene synthase and zeaxanthin epoxidase.

In another embodiment, the instant invention relates to a chimeric gene encoding a
phytoene synthase or a zeaxanthin epoxidase, or to a chimeric gene that comprises a nucleic
acid fragment that is complementary to a nucleic acid fragment encoding a phytoene
synthase or a zeaxanthin epoxidase, operably linked to suitable regulatory sequences,
wherein expression of the chimeric gene results in production of levels of the encoded
protein in a transformed host cell that are altered (i.e., increased or decreased) from the level
produced in an untransformed host cell.

In a further embodiment, the instant invention concerns a transformed host cell
comprising in its genome a chimeric gene encoding a phytoene synthase or a zeaxanthin
epoxidase, operably linked to suitable regulatory sequences. Expression of the chimeric
gene results in production of altered levels of the encoded protein in the transformed host
cell. The transformed host cell can be of eukaryotic or prokaryotic origin, and includes cells
derived from higher plants and microorganisms. The invention also includes transformed
plants that arise from transformed host cells of higher plants, and seeds derived from such
transformed plants.

An additional embodiment of the instant invention concerns a method of altering the
level of expression of a phytoene synthase or a zeaxanthin epoxidase in a transformed host
cell comprising: a) transforming a host cell with a chimeric gene comprising a nucleic acid
fragment encoding a phytoene synthase or a zeaxanthin epoxidase; and b) growing the
transformed host cell under conditions that are suitable for expression of the chimeric gene
wherein expression of the chimeric gene results in production of altered levels of phytoene
synthase or zeaxanthin epoxidase in the transformed host cell.

An additional embodiment of the instant invention concerns a method for obtaining a
nucleic acid fragment encoding all or a substantial portion of an amino acid sequence
encoding a phytoene synthase or a zeaxanthin epoxidase.

A further embodiment of the instant invention is a method for evaluating at least one
compound for its ability to inhibit the activity of a phytoene synthase or a zeaxanthin
epoxidase, the method comprising the steps of: (a) transforming a host cell with a chimeric
gene comprising a nucleic acid fragment encoding a phytoene synthase or a zeaxanthin
epoxidase, operably linked to suitable regulatory sequences; (b) growing the transformed
host cell under conditions that are suitable for expression of the chimeric gene wherein
expression of the chimeric gene results in production of phytoene synthase or zeaxanthin
epoxidase in the transformed host cell; (c) optionally purifying the phytoene synthase or the
zeaxanthin epoxidase expressed by the transformed host cell; (d) treating the phytoene
synthase or the zeaxanthin epoxidase with a compound to be tested; and (e) comparing the
activity of the phytoene synthase or the zeaxanthin epoxidase that has been treated with a
test compound to the activity of an untreated phytoene synthase or zeaxanthin epoxidase,
thereby selecting compounds with potential for inhibitory activity.

BRIEF DESCRIPTION OF THE 
DRAWING AND SEQUENCE DESCRIPTIONS 
The invention can be more fully understood from the following detailed description
and the accompanying drawing and Sequence Listing which form a part of this application.

Figure 1 depicts the amino acid sequence alignment between the phytoene synthase
from corn contig assembled from clones csil.pk0034.d8 and p0008xb31d95rb (SEQ ID NO:2),
soybean clone sl2.pk0045.bl0 (SEQ ID NO:14), Lycopersicon esculentum (NCBI gi
Accession No. 585747, SEQ ID NO:27) and Zea mays (NCBI gi Accession No. 1346883,
SEQ ID NO:28). Amino acids which are conserved among all sequences are indicated with
an asterisk (*). Dashes are used by the program to maximize alignment of the sequences.
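
The asterisk convention described for Figure 1 can be illustrated with a minimal
sketch. The function name `conservation_line` and the toy sequences are hypothetical
and are not taken from the Sequence Listing; the sketch assumes the sequences have
already been aligned to equal length, with dashes as gap characters.

```python
def conservation_line(aligned):
    """Return a marker line with '*' under columns where all sequences agree.

    `aligned` is a list of equal-length strings that are already aligned;
    '-' denotes a gap introduced by the alignment program.
    """
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        first = column[0]
        # A column is conserved only if it holds the same residue (not a gap)
        # in every sequence.
        conserved = first != "-" and all(res == first for res in column)
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Toy alignment for illustration only.
toy_alignment = ["MKT-LV", "MRTALV", "MKTGLV"]
for row in toy_alignment:
    print(row)
print(conservation_line(toy_alignment))
```

A column that contains a gap in any sequence is never marked, which matches the
reading that an asterisk denotes a residue conserved among all sequences.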

The following sequence descriptions and Sequence Listing attached hereto comply 
with the rules governing nucleotide and/or amino acid sequence disclosures in patent 
applications as set forth in 37 C.F.R. §1.821-1.825. 

SEQ ID NO:1 is the nucleotide sequence comprising the contig assembled from the
entire cDNA insert in clone csil.pk0034.d8 and a portion of the cDNA insert in clone
p0008xb31d95rb encoding an entire corn phytoene synthase 2.

SEQ ID NO:2 is the deduced amino acid sequence of an entire corn phytoene synthase
2 derived from the nucleotide sequence of SEQ ID NO:1.

SEQ ID NO:3 is the nucleotide sequence comprising the contig assembled from a
portion of the cDNA insert in clones p0121.cfrmo87r, p0091.cmarc67r and p0005.cbmej22r
encoding almost half a corn phytoene synthase.

SEQ ID NO:4 is the deduced amino acid sequence of almost half a corn phytoene
synthase derived from the nucleotide sequence of SEQ ID NO:3.

SEQ ID NO:5 is the nucleotide sequence comprising the contig assembled from a
portion of the cDNA insert in clones rdslc.pk005.15, rlr6.pk0028.g3 and rds2c.pk007.fl6
encoding the N-terminal 40% of a rice phytoene synthase.

SEQ ID NO:6 is the deduced amino acid sequence of the N-terminal 40% of a rice
phytoene synthase derived from the nucleotide sequence of SEQ ID NO:5.

SEQ ID NO:7 is the nucleotide sequence comprising the contig assembled from a
portion of the cDNA insert in clones rl0n.pkl09.j7 and rl0n.pkl20.p4 encoding a portion of
a rice phytoene synthase 2.

SEQ ID NO:8 is the deduced amino acid sequence of a portion of a rice phytoene
synthase 2 derived from the nucleotide sequence of SEQ ID NO:7.

SEQ ID NO:9 is the nucleotide sequence comprising the contig assembled from the
entire cDNA insert in clone rlO.pk0005.e5 and a portion of the cDNA insert in clones
rcaln.pk001.18 and rlmln.pk001.a4 encoding the C-terminal two thirds of a rice phytoene
synthase.

SEQ ID NO:10 is the deduced amino acid sequence of the C-terminal two thirds of a
rice phytoene synthase derived from the nucleotide sequence of SEQ ID NO:9.

SEQ ID NO:11 is the nucleotide sequence comprising the entire cDNA insert in clone
sll.pk0029.h5 encoding the C-terminal two thirds of a soybean phytoene synthase 2.

SEQ ID NO:12 is the deduced amino acid sequence of the C-terminal two thirds of a
soybean phytoene synthase 2 derived from the nucleotide sequence of SEQ ID NO:11.

SEQ ID NO:13 is the nucleotide sequence comprising the entire cDNA insert in clone
sl2.pk0045.bl0 encoding an entire soybean phytoene synthase.

SEQ ID NO:14 is the deduced amino acid sequence of an entire soybean phytoene
synthase derived from the nucleotide sequence of SEQ ID NO:13.

SEQ ID NO:15 is the nucleotide sequence comprising the entire cDNA insert in clone
wrl.pk0139.g3 encoding the C-terminal two thirds of a wheat phytoene synthase 2.

SEQ ID NO:16 is the deduced amino acid sequence of the C-terminal two thirds of a
wheat phytoene synthase 2 derived from the nucleotide sequence of SEQ ID NO:15.

SEQ ID NO:17 is the nucleotide sequence comprising the contig assembled from the
entire cDNA insert in clone cbn2.pk0051.e8 and a portion of the cDNA insert in clones
p0031.ccmaj44r and p0097.cqrag63r encoding a portion of a corn zeaxanthin epoxidase.

SEQ ID NO:18 is the deduced amino acid sequence of a portion of a corn zeaxanthin
epoxidase derived from the nucleotide sequence of SEQ ID NO:17.

SEQ ID NO:19 is the nucleotide sequence comprising the contig assembled from the
entire cDNA insert in clone crln.pk0033.d8 and a portion of the cDNA insert in clones
p0110.cgsmp01r, p0012.cglae05r and p0088.clrim55r encoding the C-terminal half of a corn
zeaxanthin epoxidase.

SEQ ID NO:20 is the deduced amino acid sequence of the C-terminal half of a corn
zeaxanthin epoxidase derived from the nucleotide sequence of SEQ ID NO:19.

SEQ ID NO:21 is the nucleotide sequence comprising the entire cDNA insert in clone
sll.pk0015.c4 encoding a portion of a soybean zeaxanthin epoxidase.

SEQ ID NO:22 is the deduced amino acid sequence of a portion of a soybean
zeaxanthin epoxidase derived from the nucleotide sequence of SEQ ID NO:21.

SEQ ID NO:23 is the nucleotide sequence comprising the 5'-terminal portion of the
cDNA insert in clone sl2.pk0109.b6 encoding the N-terminal three quarters of a soybean
zeaxanthin epoxidase.

SEQ ID NO:24 is the deduced amino acid sequence of the N-terminal three quarters of
a soybean zeaxanthin epoxidase derived from the nucleotide sequence of SEQ ID NO:23.

SEQ ID NO:25 is the nucleotide sequence comprising the 3'-terminal portion of the 
cDNA insert in clone sl2.pk0109.b6 encoding the C-terminal fifth of a soybean zeaxanthin 
epoxidase. 

SEQ ID NO:26 is the deduced amino acid sequence of the C-terminal fifth of a
soybean zeaxanthin epoxidase derived from the nucleotide sequence of SEQ ID NO:25.

SEQ ID NO:27 is the amino acid sequence of a Lycopersicon esculentum phytoene 
synthase, NCBI gi Accession No. 585747. 

SEQ ID NO:28 is the amino acid sequence of a Cucumis melo phytoene synthase, 
NCBI gi Accession No. 1346882. 

The Sequence Listing contains the one letter code for nucleotide sequence characters
and the three letter codes for amino acids as defined in conformity with the IUPAC-IUBMB
standards described in Nucleic Acids Research 13:3021-3030 (1985) and in the Biochemical
Journal 219 (No. 2):345-373 (1984) which are herein incorporated by reference. The
symbols and format used for nucleotide and amino acid sequence data comply with the rules
set forth in 37 C.F.R. §1.822.

DETAILED DESCRIPTION OF THE INVENTION 
In the context of this disclosure, a number of terms shall be utilized. As used herein,
an "isolated nucleic acid fragment" is a polymer of RNA or DNA that is single- or
double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An
isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or
more segments of cDNA, genomic DNA or synthetic DNA. As used herein, "contig" refers
to an assemblage of overlapping nucleic acid sequences to form one contiguous nucleotide
sequence. For example, several DNA sequences can be compared and aligned to identify
common or overlapping regions. The individual sequences can then be assembled into a
single contiguous nucleotide sequence.
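
The idea of assembling overlapping sequences into a contig can be sketched as
follows. This is a minimal illustration, not the assembly procedure used for the
sequences reported here: it assumes error-free reads that are already ordered 5' to 3'
and that overlap exactly, and the names `overlap_length`, `merge_pair`,
`assemble_in_order` and the toy reads are all hypothetical.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_pair(left, right, min_overlap=3):
    """Merge two overlapping sequences, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


def assemble_in_order(reads, min_overlap=3):
    """Fold an ordered list of overlapping reads into one contiguous sequence."""
    contig = reads[0]
    for read in reads[1:]:
        merged = merge_pair(contig, read, min_overlap)
        if merged is None:
            raise ValueError("read does not overlap the growing contig")
        contig = merged
    return contig


# Toy reads for illustration only.
reads = ["ATGGCTAGC", "TAGCTTGACC", "GACCGTAA"]
print(assemble_in_order(reads))
```

Real assemblers must also handle sequencing errors, reverse-complement reads and
unordered input, none of which this sketch attempts.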

As used herein, "substantially similar" refers to nucleic acid fragments wherein
changes in one or more nucleotide bases result in substitution of one or more amino acids,
but do not affect the functional properties of the protein encoded by the DNA sequence.
"Substantially similar" also refers to nucleic acid fragments wherein changes in one or more
nucleotide bases do not affect the ability of the nucleic acid fragment to mediate alteration
of gene expression by antisense or co-suppression technology. "Substantially similar" also
refers to modifications of the nucleic acid fragments of the instant invention such as deletion
or insertion of one or more nucleotides that do not substantially affect the functional
properties of the resulting transcript vis-a-vis the ability to mediate alteration of gene
expression by antisense or co-suppression technology or alteration of the functional
properties of the resulting protein molecule. It is therefore understood that the invention
encompasses more than the specific exemplary sequences.

For example, it is well known in the art that antisense suppression and co-suppression
of gene expression may be accomplished using nucleic acid fragments representing less than
the entire coding region of a gene, and by nucleic acid fragments that do not share 100%
sequence identity with the gene to be suppressed. Moreover, alterations in a gene which
result in the production of a chemically equivalent amino acid at a given site, but do not
affect the functional properties of the encoded protein, are well known in the art. Thus, a
codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon
encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue,
such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one
negatively charged residue for another, such as aspartic acid for glutamic acid, or one
positively charged residue for another, such as lysine for arginine, can also be expected to
produce a functionally equivalent product. Nucleotide changes which result in alteration of
the N-terminal and C-terminal portions of the protein molecule would also not be expected
to alter the activity of the protein. Each of the proposed modifications is well within the
routine skill in the art, as is determination of retention of biological activity of the encoded
products. Moreover, substantially similar nucleic acid fragments may also be characterized
by their ability to hybridize, under stringent conditions (0.1X SSC, 0.1% SDS, 65°C), with
the nucleic acid fragments disclosed herein.
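
The substitution examples in the preceding paragraph can be expressed as a small
lookup. This is a hedged sketch: the three groups below contain only the residues
named in the text (one-letter codes), the group labels and the function name
`substitution_class` are hypothetical, and a "conservative" label here is only a
prediction that would still need to be confirmed by measuring biological activity.

```python
# Residue groups taken from the examples in the text, in one-letter code:
# alanine, glycine, valine, leucine, isoleucine; aspartic and glutamic acid;
# lysine and arginine. The labels are illustrative assumptions.
GROUPS = {
    "hydrophobic": set("AGVLI"),
    "negatively charged": set("DE"),
    "positively charged": set("KR"),
}


def substitution_class(original, replacement):
    """Classify a single amino acid substitution using the groups above."""
    if original == replacement:
        return "identical"
    for name, members in GROUPS.items():
        if original in members and replacement in members:
            return f"conservative ({name})"
    return "not covered by the listed examples"


print(substitution_class("D", "E"))
print(substitution_class("A", "V"))
print(substitution_class("K", "D"))
```

Substitutions outside the listed examples are reported as not covered, rather than
as disruptive, because the text makes no statement about them.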

Substantially similar nucleic acid fragments of the instant invention may also be
characterized by the percent similarity of the amino acid sequences that they encode to the
amino acid sequences disclosed herein, as determined by algorithms commonly employed by
those skilled in this art. Preferred are those nucleic acid fragments whose nucleotide
sequences encode amino acid sequences that are 80% similar to the amino acid sequences
reported herein. More preferred nucleic acid fragments encode amino acid sequences that
are 90% similar to the amino acid sequences reported herein. Most preferred are nucleic
acid fragments that encode amino acid sequences that are 95% similar to the amino acid
sequences reported herein. Sequence alignments and percent similarity calculations were
performed using the Megalign program of the LASERGENE bioinformatics computing suite
(DNASTAR Inc., Madison, WI). Multiple alignment of the amino acid sequences was
performed using the Clustal method of alignment (Higgins, D. G. and Sharp, P. M. (1989)
CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH
PENALTY=10). Default parameters for pairwise alignments using the Clustal method were
KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5.
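
As a rough illustration of how a percentage can be derived from an alignment, the
sketch below computes percent identity over the aligned columns of two pre-aligned
sequences and maps the result onto the 80%, 90% and 95% tiers named above. It is
not the Megalign or Clustal calculation: percent identity is used here as a simpler
stand-in for percent similarity, the tiers are read as minimum thresholds (an
assumption), and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue in both sequences.

    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" and res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def preference_tier(percent):
    """Map a percentage onto the tiers named in the text (read as minimums)."""
    if percent >= 95:
        return "most preferred"
    if percent >= 90:
        return "more preferred"
    if percent >= 80:
        return "preferred"
    return "below 80%"


# Toy sequences for illustration only.
score = percent_identity("MKTLVAAGHK", "MKTLVAAGHR")
print(score, preference_tier(score))
```

A similarity score would additionally credit chemically related residues, so it is
generally at least as high as the identity computed here.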

A "substantial portion" of an amino acid or nucleotide sequence comprises enough of 
the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to afford 
putative identification of that polypeptide or gene, either by manual evaluation of the
sequence by one skilled in the art, or by computer-automated sequence comparison and
identification using algorithms such as BLAST (Basic Local Alignment Search Tool;
Altschul, S. F., et al., (1993) J. Mol. Biol. 215:403-410; see also
www.ncbi.nlm.nih.gov/BLAST/). In general, a sequence of ten or more contiguous amino
acids or thirty or more nucleotides is necessary in order to putatively identify a polypeptide
or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect
to nucleotide sequences, gene specific oligonucleotide probes comprising 20-30 contiguous
nucleotides may be used in sequence-dependent methods of gene identification (e.g.,
Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or
bacteriophage plaques). In addition, short oligonucleotides of 12-15 bases may be used as
amplification primers in PCR in order to obtain a particular nucleic acid fragment
comprising the primers. Accordingly, a "substantial portion" of a nucleotide sequence
comprises enough of the sequence to afford specific identification and/or isolation of a
nucleic acid fragment comprising the sequence. The instant specification teaches partial or
complete amino acid and nucleotide sequences encoding one or more particular plant
proteins. The skilled artisan, having the benefit of the sequences as reported herein, may
now use all or a substantial portion of the disclosed sequences for purposes known to those
skilled in this art. Accordingly, the instant invention comprises the complete sequences as
reported in the accompanying Sequence Listing, as well as substantial portions of those
sequences as defined above.
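
The length guideline above (ten or more contiguous amino acids, or thirty or more
nucleotides) can be checked mechanically. The sketch below is a crude proxy, not a
BLAST search: it looks only for the longest exactly shared contiguous run between a
query and a reference, whereas real homology searches tolerate mismatches and
score alignments statistically. The names `longest_shared_run`,
`meets_length_guideline` and the toy sequences are hypothetical.

```python
# Minimum contiguous lengths taken from the guideline in the text.
MIN_CONTIGUOUS = {"protein": 10, "nucleotide": 30}


def longest_shared_run(query, reference):
    """Length of the longest contiguous stretch present in both sequences."""
    best = 0
    previous = [0] * (len(reference) + 1)
    for q_res in query:
        current = [0] * (len(reference) + 1)
        for j, r_res in enumerate(reference, start=1):
            if q_res == r_res:
                # Extend the run that ended at the previous position of both.
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def meets_length_guideline(query, reference, kind):
    """True if the shared run reaches the guideline length for `kind`."""
    return longest_shared_run(query, reference) >= MIN_CONTIGUOUS[kind]


# Toy sequences for illustration only.
reference = "MSTAVLKRGEWQPLNDHIFY"
print(meets_length_guideline("XXKRGEWQPLNDXX", reference, "protein"))
print(meets_length_guideline("KRGEWQPLN", reference, "protein"))
```

Meeting the length guideline is a precondition for putative identification, not
evidence of homology by itself.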

"Codon degeneracy" refers to divergence in the genetic code permitting variation of 
the nucleotide sequence without affecting the amino acid sequence of an encoded
polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment that
encodes all or a substantial portion of the amino acid sequence encoding the phytoene
synthase or the zeaxanthin epoxidase proteins as set forth in SEQ ID NOs:2, 4, 6, 8, 10, 12,
14, 16, 18, 20, 22, 24 and 26. The skilled artisan is well aware of the "codon-bias" exhibited
by a specific host cell in usage of nucleotide codons to specify a given amino acid.
Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to
design the gene such that its frequency of codon usage approaches the frequency of
preferred codon usage of the host cell.
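
Codon degeneracy can be demonstrated with the standard genetic code: two
nucleotide sequences that differ at several positions can encode the same peptide.
The sketch below builds the standard code table and translates two toy sequences;
the function names and the sequences are hypothetical and are not drawn from the
Sequence Listing.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (length divisible by 3) to one-letter code."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def synonymous_codons(amino_acid):
    """All codons that specify the given amino acid, sorted alphabetically."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Two toy sequences that differ at the third position of every codon.
print(translate("GCTGATAAA"), translate("GCCGACAAG"))
print(synonymous_codons("A"))
```

Both toy sequences translate to the same three-residue peptide, which is the
property that lets a gene be redesigned toward a host's preferred codons without
changing the encoded protein.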

"Synthetic genes" can be assembled from oligonucleotide building blocks that are 
chemically synthesized using procedures known to those skilled in the art. These building 
blocks are ligated and annealed to form gene segments which are then enzymatically 
assembled to construct the entire gene. "Chemically synthesized", as related to a sequence
of DNA, means that the component nucleotides were assembled in vitro. Manual chemical
synthesis of DNA may be accomplished using well established procedures, or automated
chemical synthesis can be performed using one of a number of commercially available
machines. Accordingly, the genes can be tailored for optimal gene expression based on
optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled
artisan appreciates the likelihood of successful gene expression if codon usage is biased
towards those codons favored by the host. Determination of preferred codons can be based
on a survey of genes derived from the host cell where sequence information is available.

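
A survey of host genes of the kind described above can be sketched as a codon
count followed by a per-amino-acid choice of the most frequent codon. This is a
minimal illustration under stated assumptions: the host genes are in-frame coding
sequences, ties are broken by table order, and the function names and the two toy
"host genes" are hypothetical. Practical codon optimization usually weighs further
factors that this sketch ignores.

```python
from collections import Counter
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_counts(genes):
    """Count codon occurrences across in-frame coding sequences."""
    counts = Counter()
    for gene in genes:
        gene = gene.upper()
        usable = len(gene) - len(gene) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            counts[gene[i:i + 3]] += 1
    return counts


def preferred_codons(genes):
    """For each amino acid, pick the codon seen most often in the survey."""
    counts = codon_counts(genes)
    best = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in best or counts[codon] > counts[best[aa]]:
            best[aa] = codon
    return best


def back_translate(peptide, preferred):
    """Write a peptide as DNA using the preferred codon for each residue."""
    return "".join(preferred[aa] for aa in peptide)


# Toy host genes for illustration only.
host_genes = ["ATGGCCGCCAAGTAA", "ATGGCCGACAAGAAGTGA"]
print(back_translate("MAKD", preferred_codons(host_genes)))
```

With a realistically large survey, the same counts could be normalized into
per-amino-acid frequencies so that a synthetic gene approaches the host's usage
distribution rather than always using the single most common codon.
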
"Gene" refers to a nucleic acid fragment that expresses a specific protein, including 
regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding
sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its
own regulatory sequences. "Chimeric gene" refers to any gene that is not a native gene,
comprising regulatory and coding sequences that are not found together in nature.
Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that
are derived from different sources, or regulatory sequences and coding sequences derived
from the same source, but arranged in a manner different than that found in nature.
"Endogenous gene" refers to a native gene in its natural location in the genome of an
organism. A "foreign" gene refers to a gene not normally found in the host organism, but
that is introduced into the host organism by gene transfer. Foreign genes can comprise
native genes inserted into a non-native organism, or chimeric genes. A "transgene" is a gene
that has been introduced into the genome by a transformation procedure.

"Coding sequence" refers to a DNA sequence that codes for a specific amino acid 
sequence. "Regulatory sequences" refer to nucleotide sequences located upstream
(5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence,
and which influence the transcription, RNA processing or stability, or translation of the
associated coding sequence. Regulatory sequences may include promoters, translation
leader sequences, introns, and polyadenylation recognition sequences.

"Promoter" refers to a DNA sequence capable of controlling the expression of a 
coding sequence or functional RNA. In general, a coding sequence is located 3' to a
promoter sequence. The promoter sequence consists of proximal and more distal upstream
elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a
DNA sequence which can stimulate promoter activity and may be an innate element of the
promoter or a heterologous element inserted to enhance the level or tissue-specificity of a
promoter. Promoters may be derived in their entirety from a native gene, or be composed of
different elements derived from different promoters found in nature, or even comprise
synthetic DNA segments. It is understood by those skilled in the art that different promoters
may direct the expression of a gene in different tissues or cell types, or at different stages of
development, or in response to different environmental conditions. Promoters which cause a
gene to be expressed in most cell types at most times are commonly referred to as
"constitutive promoters". New promoters of various types useful in plant cells are
constantly being discovered; numerous examples may be found in the compilation by
Okamuro and Goldberg, (1989) Biochemistry of Plants 15:1-82. It is further recognized that
since in most cases the exact boundaries of regulatory sequences have not been completely
defined, DNA fragments of different lengths may have identical promoter activity.

The "translation leader sequence" refers to a DNA sequence located between the
promoter sequence of a gene and the coding sequence. The translation leader sequence is
present in the fully processed mRNA upstream of the translation start sequence. The
translation leader sequence may affect processing of the primary transcript to mRNA,
mRNA stability or translation efficiency. Examples of translation leader sequences have
been described (Turner, R. and Foster, G. D. (1995) Molecular Biotechnology 3:225).

The "3' non-coding sequences" refer to DNA sequences located downstream of a 
coding sequence and include polyadenylation recognition sequences and other sequences
encoding regulatory signals capable of affecting mRNA processing or gene expression. The
polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid
tracts to the 3' end of the mRNA precursor. The use of different 3' non-coding sequences is
exemplified by Ingelbrecht et al., (1989) Plant Cell 1:671-680.

"RNA transcript" refers to the product resulting from RNA polymerase-catalyzed
transcription of a DNA sequence. When the RNA transcript is a perfect complementary
copy of the DNA sequence, it is referred to as the primary transcript or it may be an RNA
sequence derived from posttranscriptional processing of the primary transcript and is
referred to as the mature RNA. "Messenger RNA (mRNA)" refers to the RNA that is
without introns and that can be translated into protein by the cell. "cDNA" refers to a
double-stranded DNA that is complementary to and derived from mRNA. "Sense" RNA
refers to RNA transcript that includes the mRNA and so can be translated into protein by the
cell. "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a
target primary transcript or mRNA and that blocks the expression of a target gene (U.S. Pat.
No. 5,107,065, incorporated herein by reference). The complementarity of an antisense
RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence,
3' non-coding sequence, introns, or the coding sequence. "Functional RNA" refers to sense
RNA, antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has
an effect on cellular processes.

The term "operably linked" refers to the association of nucleic acid sequences on a
single nucleic acid fragment so that the function of one is affected by the other. For
example, a promoter is operably linked with a coding sequence when it is capable of
affecting the expression of that coding sequence (i.e., that the coding sequence is under the
transcriptional control of the promoter). Coding sequences can be operably linked to
regulatory sequences in sense or antisense orientation.

The term "expression", as used herein, refers to the transcription and stable
accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of
the invention. Expression may also refer to translation of mRNA into a polypeptide.
"Antisense inhibition" refers to the production of antisense RNA transcripts capable of
suppressing the expression of the target protein. "Overexpression" refers to the production
of a gene product in transgenic organisms that exceeds levels of production in normal or
non-transformed organisms. "Co-suppression" refers to the production of sense RNA
transcripts capable of suppressing the expression of identical or substantially similar foreign
or endogenous genes (U.S. Patent No. 5,231,020, incorporated herein by reference).

"Altered levels" refers to the production of gene product(s) in transgenic organisms in
amounts or proportions that differ from that of normal or non-transformed organisms.

"Mature" protein refers to a post-translationally processed polypeptide; i.e., one from
which any pre- or propeptides present in the primary translation product have been removed.
"Precursor" protein refers to the primary product of translation of mRNA; i.e., with pre- and
propeptides still present. Pre- and propeptides may be but are not limited to intracellular
localization signals.

A "chloroplast transit peptide" is an amino acid sequence which is translated in
conjunction with a protein and directs the protein to the chloroplast or other plastid types
present in the cell in which the protein is made. "Chloroplast transit sequence" refers to a
nucleotide sequence that encodes a chloroplast transit peptide. A "signal peptide" is an
amino acid sequence which is translated in conjunction with a protein and directs the protein
to the secretory system (Chrispeels, J. J., (1991) Ann. Rev. Plant Phys. Plant Mol. Biol.
42:21-53). If the protein is to be directed to a vacuole, a vacuolar targeting signal (supra)
can further be added, or if to the endoplasmic reticulum, an endoplasmic reticulum retention
signal (supra) may be added. If the protein is to be directed to the nucleus, any signal
peptide present should be removed and instead a nuclear localization signal included
(Raikhel (1992) Plant Phys. 100:1627-1632).

"Transformation" refers to the transfer of a nucleic acid fragment into the genome of a
host organism, resulting in genetically stable inheritance. Host organisms containing the
transformed nucleic acid fragments are referred to as "transgenic" organisms. Examples of
methods of plant transformation include Agrobacterium-mediated transformation (De Blaere
et al. (1987) Meth. Enzymol. 143:711) and particle-accelerated or "gene gun" transformation
technology (Klein T. M. et al. (1987) Nature (London) 327:70-73; U.S. Patent
No. 4,945,050).

Standard recombinant DNA and molecular cloning techniques used herein are well
known in the art and are described more fully in Sambrook, J., Fritsch, E. F. and
Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory
Press: Cold Spring Harbor, 1989 (hereinafter "Maniatis").

Nucleic acid fragments encoding at least a portion of several carotenoid biosynthetic
enzymes have been isolated and identified by comparison of random plant cDNA sequences
to public databases containing nucleotide and protein sequences using the BLAST
algorithms well known to those skilled in the art. Table 1 lists the proteins that are described
herein, and the designation of the cDNA clones that comprise the nucleic acid fragments
encoding these proteins.

TABLE 1
Carotenoid Biosynthetic Enzymes

Enzyme                   Clone                    Plant

Phytoene Synthase        Contig of:               Corn
                         p0008.cb31d95rb
                         csil.pk0034.d8

                         Contig of:               Corn
                         p0121.cfrmo87r
                         p0091.cmarc67r
                         p0005.cbmej22r

                         Contig of:               Rice
                         rdslc.pk005.15
                         rlr6.pk0028.g3
                         rds2c.pk007.fl6

                         Contig of:               Rice
                         rl0n.pkl09.j7
                         rl0n.pkl20.p4

                         Contig of:               Rice
                         rlmln.pk001.a4
                         rcaln.pk001.18
                         rl0.pk0005.e5

                         sll.pk0029.h5            Soybean
                         sl2.pk0045.bl0           Soybean
                         wrl.pk0139.g3            Wheat

Zeaxanthin Epoxidase     Contig of:               Corn
                         cbn2.pk0051.e8
                         p0031.ccmaj44r
                         p0097.cqrag63r

                         Contig of:               Corn
                         p0110.cgsmp01r
                         p0012.cglae05r
                         p0088.clrim55r
                         crln.pk0033.d8

                         sll.pk0015.c4            Soybean
                         sl2.pk0109.b6            Soybean



The nucleic acid fragments of the instant invention may be used to isolate cDNAs and 
genes encoding homologous proteins from the same or other plant species. Isolation of 
homologous genes using sequence-dependent protocols is well known in the art. Examples 
of sequence-dependent protocols include, but are not limited to, methods of nucleic acid 
hybridization, and methods of DNA and RNA amplification as exemplified by various uses 
of nucleic acid amplification technologies (e.g., polymerase chain reaction, ligase chain 
reaction). 

For example, genes encoding other phytoene synthases or zeaxanthin epoxidases,
either as cDNAs or genomic DNAs, could be isolated directly by using all or a portion of the
instant nucleic acid fragments as DNA hybridization probes to screen libraries from any
desired plant employing methodology well known to those skilled in the art. Specific
oligonucleotide probes based upon the instant nucleic acid sequences can be designed and
synthesized by methods known in the art (Maniatis). Moreover, the entire sequences can be
used directly to synthesize DNA probes by methods known to the skilled artisan such as
random primer DNA labeling, nick translation, or end-labeling techniques, or RNA probes
using available in vitro transcription systems. In addition, specific primers can be designed
and used to amplify a part or all of the instant sequences. The resulting amplification
products can be labeled directly during amplification reactions or labeled after amplification
reactions, and used as probes to isolate full length cDNA or genomic fragments under
conditions of appropriate stringency.

In addition, two short segments of the instant nucleic acid fragments may be used in
polymerase chain reaction protocols to amplify longer nucleic acid fragments encoding
homologous genes from DNA or RNA. The polymerase chain reaction may also be
performed on a library of cloned nucleic acid fragments wherein the sequence of one primer
is derived from the instant nucleic acid fragments, and the sequence of the other primer takes
advantage of the presence of the polyadenylic acid tracts at the 3' end of the mRNA
precursor encoding plant genes. Alternatively, the second primer sequence may be based
upon sequences derived from the cloning vector. For example, the skilled artisan can follow
the RACE protocol (Frohman et al., (1988) Proc. Natl. Acad. Sci. USA 85:8998) to generate
cDNAs by using PCR to amplify copies of the region between a single point in the transcript
and the 3' or 5' end. Primers oriented in the 3' and 5' directions can be designed from the
instant sequences. Using commercially available 3' RACE or 5' RACE systems (BRL),
specific 3' or 5' cDNA fragments can be isolated (Ohara et al., (1989) Proc. Natl. Acad. Sci.
USA 86:5673; Loh et al., (1989) Science 243:217). Products generated by the 3' and 5'
RACE procedures can be combined to generate full-length cDNAs (Frohman, M. A. and
Martin, G. R., (1989) Techniques 1:165).
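
By way of illustration only, the selection of two short segments of a known fragment as a PCR primer pair may be sketched in Python as follows. The fragment shown, the primer length and the function names are hypothetical and are not taken from the instant sequences. The melting temperature estimate uses the common 2(A+T) + 4(G+C) rule of thumb for short oligonucleotides, which is an assumption of this sketch and not a method stated herein.

```python
# Illustrative sketch only: the sequence and parameters below are made up.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string containing A/C/G/T."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def primer_pair(fragment: str, length: int = 20):
    """Pick a forward primer from the 5' end of the fragment and a reverse
    primer as the reverse complement of its 3' end."""
    if len(fragment) < 2 * length:
        raise ValueError("fragment is too short for two non-overlapping primers")
    forward = fragment[:length].upper()
    reverse = reverse_complement(fragment[-length:])
    return forward, reverse


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees Celsius: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Hypothetical 18-base fragment; real primers are typically longer.
    fwd, rev = primer_pair("ATGGCCAAATTTGACTGA", length=6)
    print("forward:", fwd, "Tm ~", wallace_tm(fwd))
    print("reverse:", rev, "Tm ~", wallace_tm(rev))
```

In practice the skilled artisan would also check primer specificity and secondary structure, which this sketch does not attempt.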

Availability of the instant nucleotide and deduced amino acid sequences facilitates
immunological screening of cDNA expression libraries. Synthetic peptides representing
portions of the instant amino acid sequences may be synthesized. These peptides can be
used to immunize animals to produce polyclonal or monoclonal antibodies with specificity
for peptides or proteins comprising the amino acid sequences. These antibodies can then
be used to screen cDNA expression libraries to isolate full-length cDNA clones of interest
(Lerner, R. A. (1984) Adv. Immunol. 36:1; Maniatis).

The nucleic acid fragments of the instant invention may be used to create transgenic
plants in which the disclosed phytoene synthase or zeaxanthin epoxidase is present at
higher or lower levels than normal or in cell types or developmental stages in which it is
not normally found. This would have the effect of altering the level of lycopene or
zeaxanthin in those cells. Because the nucleotide sequence of corn clone csil.pk0034.d8 is
so divergent from known phytoene synthase genes, it may be possible to overexpress it in
transgenic plants without causing co-suppression. Co-suppression of phytoene synthase in rice
may re-direct the carbon flux towards tocopherol biosynthesis to improve the grain eating
qualities. Manipulation of the levels of zeaxanthin epoxidase in transgenic corn may result
in higher levels of zeaxanthin, an important ingredient in animal feed.

Overexpression of the phytoene synthase or the zeaxanthin epoxidase proteins of the
instant invention may be accomplished by first constructing a chimeric gene in which the
coding region is operably linked to a promoter capable of directing expression of a gene in
the desired tissues at the desired stage of development. For reasons of convenience, the
chimeric gene may comprise promoter sequences and translation leader sequences derived
from the same genes. 3' non-coding sequences encoding transcription termination signals
may also be provided. The instant chimeric gene may also comprise one or more introns in
order to facilitate gene expression.

Plasmid vectors comprising the instant chimeric gene can then be constructed. The
choice of plasmid vector is dependent upon the method that will be used to transform host
plants. The skilled artisan is well aware of the genetic elements that must be present on the
plasmid vector in order to successfully transform, select and propagate host cells containing
the chimeric gene. The skilled artisan will also recognize that different independent
transformation events will result in different levels and patterns of expression (Jones et al.,
(1985) EMBO J. 4:2411-2418; De Almeida et al., (1989) Mol. Gen. Genetics 218:78-86),
and thus that multiple events must be screened in order to obtain lines displaying the desired
expression level and pattern. Such screening may be accomplished by Southern analysis of
DNA, Northern analysis of mRNA expression, Western analysis of protein expression, or
phenotypic analysis.

For some applications it may be useful to direct the instant carotenoid biosynthetic
enzyme to different cellular compartments, or to facilitate its secretion from the cell. It is
thus envisioned that the chimeric gene described above may be further supplemented by
altering the coding sequence to encode phytoene synthase or zeaxanthin epoxidase with
appropriate intracellular targeting sequences such as transit sequences (Keegstra, K. (1989)
Cell 56:247-253), signal sequences or sequences encoding endoplasmic reticulum
localization (Chrispeels, J. J., (1991) Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53), or
nuclear localization signals (Raikhel, N. (1992) Plant Phys. 100:1627-1632) added and/or
with targeting sequences that are already present removed. While the references cited give
examples of each of these, the list is not exhaustive and more targeting signals of utility may
be discovered in the future.

It may also be desirable to reduce or eliminate expression of genes encoding phytoene
synthase or zeaxanthin epoxidase in plants for some applications. In order to accomplish
this, a chimeric gene designed for co-suppression of the instant carotenoid biosynthetic
enzyme can be constructed by linking a gene or gene fragment encoding a phytoene
synthase or a zeaxanthin epoxidase to plant promoter sequences. Alternatively, a chimeric
gene designed to express antisense RNA for all or part of the instant nucleic acid fragment
can be constructed by linking the gene or gene fragment in reverse orientation to plant
promoter sequences. Either the co-suppression or antisense chimeric genes could be
introduced into plants via transformation wherein expression of the corresponding
endogenous genes is reduced or eliminated.
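
As an informal illustration of what linking a fragment "in reverse orientation" means at the sequence level, the following Python sketch derives the sense and antisense transcripts of a short coding-strand fragment and confirms that they are complementary. The nine-base fragment and all function names are hypothetical; none is taken from the instant sequences, and the sketch ignores promoters, terminators and processing.

```python
# Illustrative sketch only: the fragment below is a made-up example.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string containing A/C/G/T."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def sense_rna(coding_strand: str) -> str:
    """Transcript produced when the fragment is in sense orientation."""
    return coding_strand.upper().replace("T", "U")


def antisense_rna(coding_strand: str) -> str:
    """Transcript produced when the fragment is placed in reverse
    orientation relative to the promoter."""
    return reverse_complement(coding_strand).replace("T", "U")


def is_fully_complementary(rna_a: str, rna_b: str) -> bool:
    """True if the two RNAs can pair base-for-base in antiparallel fashion."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return len(rna_a) == len(rna_b) and all(
        pairs.get(x) == y for x, y in zip(rna_a, reversed(rna_b))
    )


if __name__ == "__main__":
    fragment = "ATGGCCTTA"  # hypothetical coding-strand fragment
    print("sense:    ", sense_rna(fragment))
    print("antisense:", antisense_rna(fragment))
    print("pairs:    ", is_fully_complementary(sense_rna(fragment),
                                               antisense_rna(fragment)))
```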

The instant phytoene synthase or zeaxanthin epoxidase (or portions thereof) may be 
produced in heterologous host cells, particularly in the cells of microbial hosts, and can be 
used to prepare antibodies to these proteins by methods well known to those skilled in
the art. The antibodies are useful for detecting phytoene synthase or zeaxanthin epoxidase 
in situ in cells or in vitro in cell extracts. Preferred heterologous host cells for production of 
the instant phytoene synthase or zeaxanthin epoxidase are microbial hosts. Microbial 
expression systems and expression vectors containing regulatory sequences that direct high 
level expression of foreign proteins are well known to those skilled in the art. Any of these 
could be used to construct a chimeric gene for production of the instant phytoene synthase or 
zeaxanthin epoxidase. This chimeric gene could then be introduced into appropriate 
microorganisms via transformation to provide high level expression of the encoded 
carotenoid biosynthetic enzyme. An example of a vector for high level expression of the 
instant phytoene synthase or zeaxanthin epoxidase in a bacterial host is provided 
(Example 7). 

Additionally, the instant phytoene synthase or zeaxanthin epoxidase can be used as 
targets to facilitate design and/or identification of inhibitors of those enzymes that may be 
useful as herbicides. This is desirable because the phytoene synthase or the zeaxanthin 
epoxidase described herein catalyze various steps in carotenoid biosynthesis. Accordingly, 
inhibition of the activity of one or more of the enzymes described herein could lead to 
inhibition of plant growth. Thus, the instant phytoene synthase or zeaxanthin epoxidase could
be appropriate for new herbicide discovery and design. 

All or a substantial portion of the nucleic acid fragments of the instant invention may 
also be used as probes for genetically and physically mapping the genes that they are a part 
of, and as markers for traits linked to those genes. Such information may be useful in plant 
breeding in order to develop lines with desired phenotypes. For example, the instant nucleic 
acid fragments may be used as restriction fragment length polymorphism (RFLP) markers. 
Southern blots (Maniatis) of restriction-digested plant genomic DNA may be probed with 
the nucleic acid fragments of the instant invention. The resulting banding patterns may then 
be subjected to genetic analyses using computer programs such as MapMaker (Lander et al.,
(1987) Genomics 1:174-181) in order to construct a genetic map. In addition, the nucleic
acid fragments of the instant invention may be used to probe Southern blots containing
restriction endonuclease-treated genomic DNAs of a set of individuals representing parent
and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted
and used to calculate the position of the instant nucleic acid sequence in the genetic map
previously obtained using this population (Botstein, D. et al., (1980) Am. J. Hum. Genet.
32:314-331).
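
As a simplified illustration of how segregation data can be turned into a map position, the Python sketch below estimates the recombination fraction between two markers scored in the same progeny and converts it to map distance using the Haldane and Kosambi mapping functions. It is not the algorithm used by MapMaker, which relies on multipoint likelihood methods; the genotype strings, the single-letter allele coding and the function names are assumptions made for this example.

```python
import math

# Illustrative sketch only: marker scores below are invented.


def recombination_fraction(marker_a: str, marker_b: str) -> float:
    """Fraction of progeny in which the two markers carry alleles from
    different parents (assumes a backcross with coupling-phase scoring,
    one character per individual)."""
    if len(marker_a) != len(marker_b) or not marker_a:
        raise ValueError("markers must be scored in the same non-empty progeny set")
    recombinants = sum(1 for a, b in zip(marker_a, marker_b) if a != b)
    return recombinants / len(marker_a)


def haldane_cm(r: float) -> float:
    """Haldane map distance in centimorgans (assumes no interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r: float) -> float:
    """Kosambi map distance in centimorgans (allows for interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


if __name__ == "__main__":
    # "A" = allele from parent 1, "B" = allele from parent 2, ten progeny.
    rflp_probe = "AABBABABAB"
    known_marker = "AABBABABBA"
    r = recombination_fraction(rflp_probe, known_marker)
    print(f"r = {r:.2f}, Haldane = {haldane_cm(r):.1f} cM, "
          f"Kosambi = {kosambi_cm(r):.1f} cM")
```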

The production and use of plant gene-derived probes for use in genetic mapping is
described in Bernatzky, R. and Tanksley, S. D. (1986) Plant Mol. Biol. Reporter
4(1):37-41. Numerous publications describe genetic mapping of specific cDNA clones
using the methodology outlined above or variations thereof. For example, F2 intercross
populations, backcross populations, randomly mated populations, near isogenic lines, and
other sets of individuals may be used for mapping. Such methodologies are well known to
those skilled in the art.

Nucleic acid probes derived from the instant nucleic acid sequences may also be used 
for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel, J. D., et 
al., In: Nonmammalian Genomic Analysis: A Practical Guide, Academic Press 1996,
pp. 319-346, and references cited therein).

In another embodiment, nucleic acid probes derived from the instant nucleic acid
sequences may be used in direct fluorescence in situ hybridization (FISH) mapping (Trask,
B. J. (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favor
use of large clones (several to several hundred kb; see Laan, M. et al. (1995) Genome
Research 5:13-20), improvements in sensitivity may allow performance of FISH mapping
using shorter probes.

A variety of nucleic acid amplification-based methods of genetic and physical 
mapping may be carried out using the instant nucleic acid sequences. Examples include 
allele-specific amplification (Kazazian, H. H. (1989) J. Lab. Clin. Med. 114(2):95-96),
polymorphism of PCR-amplified fragments (CAPS; Sheffield, V. C. et al. (1993) Genomics
16:325-332), allele-specific ligation (Landegren, U. et al. (1988) Science 241:1077-1080),
nucleotide extension reactions (Sokolov, B. P. (1990) Nucleic Acid Res. 18:3671), Radiation
Hybrid Mapping (Walter, M. A. et al. (1997) Nature Genetics 7:22-28) and Happy Mapping
(Dear, P. H. and Cook, P. R. (1989) Nucleic Acid Res. 17:6795-6807). For these methods,
the sequence of a nucleic acid fragment is used to design and produce primer pairs for use in 
the amplification reaction or in primer extension reactions. The design of such primers is 
well known to those skilled in the art. In methods employing PCR-based genetic mapping, 
it may be necessary to identify DNA sequence differences between the parents of the 
mapping cross in the region corresponding to the instant nucleic acid sequence. This, 
however, is generally not necessary for mapping methods. 
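
To illustrate how a DNA sequence difference between the parents of a mapping cross can be scored after amplification, the Python sketch below digests two hypothetical parental amplicons in silico and compares the resulting fragment lengths, in the manner of a CAPS marker. The amplicon sequences and function names are invented for this example; the recognition site GAATTC, cut between G and A, is that of EcoRI.

```python
from typing import List

# Illustrative sketch only: both amplicon sequences are made up.


def digest(sequence: str, site: str, cut_offset: int) -> List[int]:
    """Return the fragment lengths obtained by cutting a linear sequence at
    every occurrence of `site`; the cut falls `cut_offset` bases into the site."""
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def is_polymorphic(amplicon_1: str, amplicon_2: str, site: str, cut_offset: int) -> bool:
    """True if the two amplicons give different fragment patterns."""
    return sorted(digest(amplicon_1, site, cut_offset)) != sorted(
        digest(amplicon_2, site, cut_offset)
    )


if __name__ == "__main__":
    parent_1 = "AAAGAATTCTTT"  # carries the site
    parent_2 = "AAAGAGTTCTTT"  # single-base difference removes the site
    print("parent 1 fragments:", digest(parent_1, "GAATTC", 1))
    print("parent 2 fragments:", digest(parent_2, "GAATTC", 1))
    print("scorable marker:", is_polymorphic(parent_1, parent_2, "GAATTC", 1))
```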

Loss of function mutant phenotypes may be identified for the instant cDNA clones
either by targeted gene disruption protocols or by identifying specific mutants for these
genes contained in a maize population carrying mutations in all possible genes (Ballinger
and Benzer, (1989) Proc. Natl. Acad. Sci. USA 86:9402; Koes et al., (1995) Proc. Natl. Acad.
Sci. USA 92:8149; Bensen et al., (1995) Plant Cell 7:75). The latter approach may be
accomplished in two ways. First, short segments of the instant nucleic acid fragments may
be used in polymerase chain reaction protocols in conjunction with a mutation tag sequence
primer on DNAs prepared from a population of plants in which Mutator transposons or some
other mutation-causing DNA element has been introduced (see Bensen, supra). The
amplification of a specific DNA fragment with these primers indicates the insertion of the
mutation tag element in or near the plant gene encoding the phytoene synthase or the
zeaxanthin epoxidase. Alternatively, the instant nucleic acid fragment may be used as a
hybridization probe against PCR amplification products generated from the mutation
population using the mutation tag sequence primer in conjunction with an arbitrary genomic
site primer, such as that for a restriction enzyme site-anchored synthetic adaptor. With
either method, a plant containing a mutation in the endogenous gene encoding a phytoene
synthase or a zeaxanthin epoxidase can be identified and obtained. This mutant plant can
then be used to determine or confirm the natural function of the phytoene synthase or the
zeaxanthin epoxidase gene product.

EXAMPLES

The present invention is further defined in the following Examples, in which all parts
and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be
understood that these Examples, while indicating preferred embodiments of the invention,
are given by way of illustration only. From the above discussion and these Examples, one
skilled in the art can ascertain the essential characteristics of this invention, and without
departing from the spirit and scope thereof, can make various changes and modifications of
the invention to adapt it to various usages and conditions.

EXAMPLE 1

Composition of cDNA Libraries; Isolation and Sequencing of cDNA Clones

cDNA libraries representing mRNAs from various corn, rice, soybean and wheat
tissues were prepared. The characteristics of the libraries are described below.

TABLE 2

cDNA Libraries from Corn, Rice, Soybean and Wheat

Library   Tissue                                                      Clone

cbn2      Corn Developing Kernel Two Days After Pollination           cbn2.pk0051.e8

crln      Corn Root From 7 Day Old Seedlings*                         crln.pk0033.d8

csil      Corn Silk                                                   csil.pk0034.d8

p0005     Corn Immature Ear                                           p0005.cbmej22r

p0008     Corn Leaf, 3-Weeks-Old                                      p0008.cb31d95rb

p0012     Corn Middle 3/4 of the 3rd Leaf Blade and Mid Rib From      p0012.cglae05r
          Green Leaves Treated with Jasmonic Acid (1 mg/ml in
          0.02% Tween 20) for 24 Hours Before Collection

p0031     Corn Shoot Culture                                          p0031.ccmaj44r

p0088     Corn Leaf From Mutant Plant** Prior to Genetic Lesion       p0088.clrim55r
          Formation

p0091     Corn Roots 2 and 3 Days After Germination, Pooled           p0091.cmarc67r

p0097     Corn V9 Whorl Section (7 cm) From Plant Infected Four       p0097.cqrag63r
          Times With European Corn Borer

p0110     Corn (Stages V3/V4) Leaf Tissue Minus Midrib Harvested      p0110.cgsmp01r
          4 Hours, 24 Hours and 7 Days After Infiltration With
          Salicylic Acid, Pooled*

p0121     Corn Shank Ear Tissue Collected 5 Days After Pollination*   p0121.cfrmo87r

rcaln     Rice Callus*                                                rcaln.pk001.18

rdslc     Rice Developing Seeds                                       rdslc.pk005.15

rds2c     Rice Developing Seeds From Middle of the Plant              rds2c.pk007.fl6

rl0       Rice 15 Day Old Leaf                                        rl0.pk0005.e5

rl0n      Rice 15 Day Old Leaf*                                       rl0n.pkl09.j7
                                                                      rl0n.pkl20.p4

rlmln     Rice Leaf 15 Days After Germination Harvested 2-72          rlmln.pk001.a4
          Hours Following Infection With Magnaporthe grisea
          (4360-R-62 and 4360-R-67) Normalized at 30 Degrees C
          for 24 Hours Using 10 Fold Excess Driver

rls6      Rice Leaf 15 Days After Germination, 6 Hours After          rlr6.pk0028.g3
          Infection of Strain Magnaporthe grisea 4360-R-67
          (AVR2-YAMO); Susceptible

sll       Soybean Two-Week-Old Developing Seedlings                   sll.pk0015.c4
                                                                      sll.pk0029.h5

sl2       Soybean Two-Week-Old Developing Seedlings Treated           sl2.pk0045.bl0
          With 2.5 ppm chlorimuron                                    sl2.pk0109.b6

wrl       Wheat Root From 7 Day Old Seedling                          wrl.pk0139.g3

*These libraries were normalized essentially as described in U.S. Patent No. 5,482,845
**Simmons, C. et al. (1998) Mol. Plant Microbe Interact. 11:1110-1118

cDNA libraries were prepared in Uni-ZAP™ XR vectors according to the
manufacturer's protocol (Stratagene Cloning Systems, La Jolla, CA). Conversion of the
Uni-ZAP™ XR libraries into plasmid libraries was accomplished according to the protocol
provided by Stratagene. Upon conversion, cDNA inserts were contained in the plasmid
vector pBluescript. cDNA inserts from randomly picked bacterial colonies containing
recombinant pBluescript plasmids were amplified via polymerase chain reaction using
primers specific for vector sequences flanking the inserted cDNA sequences, or plasmid
DNA was prepared from cultured bacterial cells. Amplified insert DNAs or plasmid DNAs
were sequenced in dye-primer sequencing reactions to generate partial cDNA sequences
(expressed sequence tags or "ESTs"; see Adams, M. D. et al., (1991) Science 252:1651).
The resulting ESTs were analyzed using a Perkin Elmer Model 377 fluorescent sequencer.

WO 99/55889 PCT/US99/08789

EXAMPLE 2

Identification of cDNA Clones

ESTs encoding carotenoid biosynthetic enzymes were identified by conducting
BLAST (Basic Local Alignment Search Tool; Altschul, S. F., et al., (1993) J. Mol. Biol.
215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to sequences
contained in the BLAST "nr" database (comprising all non-redundant GenBank CDS
translations, sequences derived from the 3-dimensional structure Brookhaven Protein Data
Bank, the last major release of the SWISS-PROT protein sequence database, EMBL, and
DDBJ databases). The cDNA sequences obtained in Example 1 were analyzed for similarity
to all publicly available DNA sequences contained in the "nr" database using the BLASTN
algorithm provided by the National Center for Biotechnology Information (NCBI). The
DNA sequences were translated in all reading frames and compared for similarity to all
publicly available protein sequences contained in the "nr" database using the BLASTX
algorithm (Gish, W. and States, D. J. (1993) Nature Genetics 3:266-272) provided by the
NCBI. For convenience, the P-value (probability) of observing a match of a cDNA
sequence to a sequence contained in the searched databases merely by chance as calculated
by BLAST is reported herein as a "pLog" value, which represents the negative of the
logarithm of the reported P-value. Accordingly, the greater the pLog value, the greater the
likelihood that the cDNA sequence and the BLAST "hit" represent homologous proteins.
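
The relationship between a P-value and its pLog value can be illustrated with a short Python sketch. The text above does not state the base of the logarithm; base 10 is assumed here, so that a P-value of 1e-33 corresponds to a pLog of 33. The function names are hypothetical and chosen only for this example.

```python
import math

# Illustrative sketch only: base-10 logarithm is an assumption.


def plog(p_value: float) -> float:
    """Return the pLog score, taken here as -log10(P), for a BLAST P-value."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("P-value must lie in the interval (0, 1]")
    return -math.log10(p_value)


def p_value_from_plog(score: float) -> float:
    """Invert the transformation: recover the P-value from a pLog score."""
    return 10.0 ** (-score)


if __name__ == "__main__":
    for p in (1e-20, 1e-33, 2e-55):
        print(f"P = {p:.1e}  ->  pLog = {plog(p):.2f}")
```

Because the transformation is monotonically decreasing in P, ranking hits by descending pLog is equivalent to ranking them by ascending P-value.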

EXAMPLE 3

Characterization of cDNA Clones Encoding Phytoene Synthase

The BLASTX search using the EST sequences from clones csil.pk0034.d8,
ssm.pk0011.d9, sll.pk0069.e4, sll.pk0029.h5, sll.pk0073.gl0, sll.pk0031.b8 and
wrl.pk0139.g3 revealed similarity of the proteins encoded by the cDNAs to Phytoene
Synthase from corn, Arabidopsis thaliana, Lycopersicon esculentum, Cucumis melo, and
Capsicum annuum (GenBank Accession Nos. U32636, L25812, L23424, Z37543, X68017
respectively). Further analysis of the sequences from clones ssm.pk0011.d9 and
sll.pk0069.e4 revealed a significant region of overlap, thus affording the assembly of a
contig encoding a portion of a soybean Phytoene Synthase. Likewise, further analysis of the
sequences from clones sll.pk0029.h5 and sll.pk0073.gl0 revealed a significant region of
overlap, thus affording the assembly of an additional contig encoding a portion of a soybean
Phytoene Synthase. The BLAST results for each of these ESTs and contigs are shown in
Table 3:

TABLE 3

BLAST Results for Clones Encoding Polypeptides Homologous
to Phytoene Synthase

                                                GenBank
Clone               Organism                    Accession No.   BLAST pLog Score

csil.pk0034.d8      Maize                       U32636          33.00

Contig of:          Arabidopsis thaliana        L25812          54.40
ssm.pk0011.d9
sll.pk0069.e4

Contig of:          Lycopersicon esculentum     L23424          20.00
sll.pk0029.h5
sll.pk0073.gl0

sll.pk0031.b8       Cucumis melo                Z37543          50.00

wrl.pk0139.g3       Capsicum annuum             X68017          31.70



TBLASTN analysis of the proprietary plant EST database indicated that other corn,
rice and soybean clones besides those mentioned above encoded phytoene synthetase. The
BLASTX search using the nucleotide sequences of the contig assembled from a portion of
the cDNA insert in clones p0121.cfrmo87r, p0091.cmarc67r and p0005.cbmej22r revealed
similarity of the proteins encoded by the cDNAs to phytoene synthase from Capsicum
annuum (NCBI gi Accession No. 585749). The BLASTX search using the nucleotide
sequences of the contig assembled from a portion of the cDNA insert in clones
rdslc.pk005.15, rlr6.pk0028.g3 and rds2c.pk007.fl6 and of the contig assembled from the
entire cDNA insert in clone rl0.pk0005.e5 and a portion of the cDNA insert in clones
rlmln.pk001.a4 and rcaln.pk001.18 revealed similarity of the proteins encoded by the
cDNAs to phytoene synthase from Zea mays (NCBI gi Accession No. 1346883). The
BLASTX search using the nucleotide sequences from the contig assembled from a portion of
the cDNA insert in clones rl0n.pkl09.j7 and rl0n.pkl20.p4 revealed similarity of the
proteins encoded by the cDNAs to phytoene synthase 2 from Lycopersicon esculentum
(NCBI gi Accession No. 585747). The BLASTX search using the nucleotide sequences from
the entire cDNA insert in clone sl2.pk0045.bl0 revealed similarity of the proteins encoded by
the cDNAs to phytoene synthase from Narcissus pseudonarcissus (NCBI gi Accession
No. 1709885). The BLAST results for each of these sequences are shown in Table 4:

TABLE 4

BLAST Results for Clones Encoding Polypeptides Homologous
to Phytoene Synthase

                                                  NCBI gi
Clone               Organism                      Accession No.   BLAST pLog Score

Contig of:          Capsicum annuum               585749          89.22
p0121.cfrmo87r
p0091.cmarc67r
p0005.cbmej22r

Contig of:          Zea mays                      1346883         54.22
rdslc.pk005.15
rlr6.pk0028.g3
rds2c.pk007.fl6

Contig of:          Lycopersicon esculentum       585747          54.30
rl0n.pkl09.j7
rl0n.pkl20.p4

Contig of:          Zea mays                      1346883         132.0
rlmln.pk001.a4
rcaln.pk001.18
rl0.pk0005.e5

sl2.pk0045.bl0      Narcissus pseudonarcissus     1709885         176.0



The sequence of the entire cDNA insert in clone csil.pk0034.d8 was determined and a
contig was assembled from this sequence and a portion of the cDNA insert from clone
p0008.cb31d95rb. The sequence of this contig is shown in SEQ ID NO:1; the deduced
amino acid sequence of this cDNA is shown in SEQ ID NO:2. The amino acid sequence set
forth in SEQ ID NO:2 was evaluated by BLASTP, yielding a pLog value of 132.0 versus the
Lycopersicon esculentum phytoene synthase 2 sequence (NCBI gi Accession No. 585747;
SEQ ID NO:27). The sequence of the contig assembled from a portion of the cDNA insert
from clones p0121.cfrmo87r, p0091.cmarc67r and p0005.cbmej22r is shown in SEQ ID
NO:3; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO:4. The
sequence of the contig assembled from a portion of the cDNA insert from clones
rdslc.pk005.15, rlr6.pk0028.g3 and rds2c.pk007.fl6 is shown in SEQ ID NO:5; the deduced
amino acid sequence of this cDNA is shown in SEQ ID NO:6. The sequence of the contig
assembled from a portion of the cDNA insert from clones rl0n.pkl09.j7 and rl0n.pkl20.p4 is
shown in SEQ ID NO:7; the deduced amino acid sequence of this cDNA is shown in SEQ
ID NO:8. The sequence of the contig assembled from the entire cDNA insert in clone
rl0.pk0005.e5 and a portion of the cDNA insert from clones rlmln.pk001.a4 and
rcaln.pk001.18 is shown in SEQ ID NO:9; the deduced amino acid sequence of this cDNA is
shown in SEQ ID NO:10. The sequence of the entire cDNA insert in clone sll.pk0029.h5
was determined and is shown in SEQ ID NO:11; the deduced amino acid sequence of this
cDNA is shown in SEQ ID NO:12. The EST sequences from clones ssm.pk0011.d9,
sll.pk0069.e4 and sll.pk0073.g10 are included in the sequence of the entire cDNA insert
in clone sll.pk0029.h5. The amino acid sequence set forth in SEQ ID NO:12 was evaluated
by BLASTP, yielding a pLog value of 114.0 versus the Cucumis melo sequence (NCBI gi
Accession No. 1346882). The sequence of the entire cDNA insert in clone sl2.pk0045.bl0
was determined and is shown in SEQ ID NO:13; the deduced amino acid sequence of this
cDNA is shown in SEQ ID NO:14. The EST sequence from clone sll.pk0031.b8 is
included in the sequence of the entire cDNA insert from clone sl2.pk0045.bl0. The amino
acid sequence set forth in SEQ ID NO:14 was evaluated by BLASTP, yielding a pLog value
of 153.0 versus the Cucumis melo sequence. The sequence of the entire cDNA insert in
clone wrl.pk0139.g3 was determined and is shown in SEQ ID NO:15; the deduced amino
acid sequence of this cDNA is shown in SEQ ID NO:16. The amino acid sequence set forth
in SEQ ID NO:16 was evaluated by BLASTP, yielding a pLog value of 118.0 versus the
Lycopersicon esculentum sequence. Figure 1 presents an alignment of the amino acid
sequences set forth in SEQ ID NOs:2 and 14 with the Lycopersicon esculentum sequence
(SEQ ID NO:27) and the Cucumis melo sequence (SEQ ID NO:28). Table 5 presents a
calculation of the percent similarity of the amino acid sequences set forth in SEQ ID
NOs:2, 4, 6, 8, 10, 12, 14 and 16 with the Lycopersicon esculentum sequence (SEQ ID
NO:27) and the Cucumis melo sequence (SEQ ID NO:28).
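
As a rough illustration of how the pLog values quoted above relate to BLAST probabilities, the following sketch assumes the usual convention that a pLog is the negative base-10 logarithm of the reported P-value. The function names are hypothetical and are not part of any BLAST distribution.

```python
import math

def plog_from_pvalue(p_value: float) -> float:
    """Return the pLog, assumed here to be -log10 of a BLAST P-value."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("P-value must lie in (0, 1]")
    return -math.log10(p_value)

def pvalue_from_plog(plog: float) -> float:
    """Invert the conversion: P = 10 ** (-pLog)."""
    return 10.0 ** (-plog)

# pLog values reported in the text for SEQ ID NOs:2, 12, 14 and 16.
reported = {
    "SEQ ID NO:2": 132.0,
    "SEQ ID NO:12": 114.0,
    "SEQ ID NO:14": 153.0,
    "SEQ ID NO:16": 118.0,
}
for name, plog in reported.items():
    print(f"{name}: pLog {plog:.1f} corresponds to P of about 1e-{plog:.0f}")
```

Under this convention a larger pLog means a smaller probability that the match arose by chance, which is why the values above are read as strong evidence of homology.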

TABLE 5

Percent Similarity of Amino Acid Sequences Deduced From the Nucleotide Sequences
of cDNA Clones Encoding Polypeptides Homologous to Phytoene Synthase

                                      Percent Similarity to
Clone                  SEQ ID NO.     1346882      585747

Contig of:                 2           57.0         78.1
  p0008.cb31d95rb
  csil.pk0034.d8

Contig of:                 4           70.4         74.2
  p0121.cfrmo87r
  p0091.cmarc67r
  p0005.cbmej22r

Contig of:                 6           47.6         32.3
  rdslc.pk005.15
  rlr6.pk0028.g3
  rds2c.pk007.fl6

Contig of:                 8           82.4         82.4
  rl0n.pkl09.j7
  rl0n.pkl20.p4

Contig of:                10           77.0         77.8
  rlmln.pk001.a4
  rcaln.pk001.18
  rl0.pk0005.e5

sll.pk0029.h5             12           77.1         78.7

sl2.pk0045.bl0            14           66.8         78.4

wrl.pk0139.g3             16           78.7         81.1



Sequence alignments and percent similarity calculations were performed using the
Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc.,
Madison, WI). Multiple alignment of the amino acid sequences was performed using the
Clustal method of alignment (Higgins, D.G. and Sharp, P.M. (1989) CABIOS. 5:151-153)
with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10).
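
To make the idea of a percent score over an alignment concrete, the sketch below computes a simple percent identity from two already-aligned sequences. It is only an illustration: it assumes gap-containing columns are ignored, it does not reproduce the Megalign similarity formula, and the two sequence fragments are made up rather than taken from Figure 1.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned columns whose residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared

# Toy aligned fragments (hypothetical, for illustration only).
seq_a = "MKT-LYLYCYYVAG"
seq_b = "MKTELYLWC-YVAG"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Scores produced by a dedicated alignment package will generally differ from this simple count, because they depend on the alignment method and on how gaps and conservative substitutions are treated.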

Sequence alignments and BLAST scores and probabilities indicate that the instant
nucleic acid fragments encode entire or nearly entire corn and soybean phytoene synthase
and portions of corn, rice, soybean and wheat phytoene synthase isozymes. These sequences
represent the first rice, soybean and wheat sequences encoding phytoene synthase, an entire
corn variant which is 55.7% similar to the corn sequences available in the art (NCBI gi
Accession Nos. 1346883 and 1098665) and a portion of a corn variant which is 72.0%
similar to the art sequences.
EXAMPLE 4

Characterization of cDNA Clones Encoding Zeaxanthin Epoxidase
The BLASTX search using the nucleotide sequences from clones cbn2.pk0051.e8 and
crln.pk0033.d8, and the EST sequence from clone sll.pk0015.c4, revealed similarity of the
proteins encoded by the cDNAs to zeaxanthin epoxidase from Lycopersicon esculentum
and Nicotiana plumbaginifolia (GenBank Accession Nos. Z83835 and X95732,
respectively). The BLAST results for each of these sequences are shown in Table 6:

TABLE 6

BLASTn Results for Clones Encoding Polypeptides Homologous
to Zeaxanthin Epoxidase

                                                GenBank          BLAST
Clone              Organism                     Accession No.    pLog Score

cbn2.pk0051.e8     Lycopersicon esculentum      Z83835           45.52

crln.pk0033.d8     Nicotiana plumbaginifolia    X95732           65.70

sll.pk0015.c4      Lycopersicon esculentum      Z83835            8.30

TBLASTN analysis of the proprietary plant EST database indicated that another
soybean clone besides sll.pk0015.c4 also encoded zeaxanthin epoxidase. The BLASTX
search using the EST sequences from the 5' terminal and 3' terminal portions of the cDNA
insert in clone sl2.pk0109.b6 revealed similarity of the proteins encoded by the cDNAs to
zeaxanthin epoxidase from Prunus armeniaca (NCBI gi Accession No. 3264757), with
pLog values of >254 and 41.70, respectively.

The sequence of the entire cDNA insert in clone cbn2.pk0051.e8 was determined and
a contig was assembled from this sequence and a portion of the cDNA insert from clones
p0031.ccmaj44r and p0097.cqrag63r. The nucleotide sequence of this contig is shown in
SEQ ID NO:17; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO:18.
The sequence of the entire cDNA insert in clone crln.pk0033.d8 was determined and a
contig was assembled from this sequence and a portion of the cDNA insert from clones
p0110.cgsmp01r, p0012.cglae05r and p0088.clrim55r. The nucleotide sequence of this
contig is shown in SEQ ID NO:19; the deduced amino acid sequence of this cDNA is shown
in SEQ ID NO:20. The sequence of the entire cDNA insert in clone sll.pk0015.c4 was
determined and is shown in SEQ ID NO:21; the deduced amino acid sequence of this cDNA
is shown in SEQ ID NO:22. The sequence of the 5' terminus of the cDNA insert in clone
sl2.pk0109.b6 was determined and is shown in SEQ ID NO:23; the deduced amino acid
sequence of this cDNA is shown in SEQ ID NO:24. The sequence of the 3' terminus of the
cDNA insert in clone sl2.pk0109.b6 was determined and is shown in SEQ ID NO:25; the
deduced amino acid sequence of this cDNA is shown in SEQ ID NO:26.

Table 7 presents a calculation of the percent similarity of the amino acid
sequences set forth in SEQ ID NOs:18, 20, 22, 24 and 26 and the Lycopersicon esculentum
and Prunus armeniaca sequences.

TABLE 7

Percent Similarity of Amino Acid Sequences Deduced From the Nucleotide Sequences of
cDNA Clones Encoding Polypeptides Homologous to Zeaxanthin Epoxidase

                                        Percent Identity to
Clone                     SEQ ID NO.    1772985      3264757

Contig of:                    18         55.1         56.6
  cbn2.pk0051.e8
  p0031.ccmaj44r
  p0097.cqrag63r

Contig of:                    20         66.5         64.9
  p0110.cgsmp01r
  p0012.cglae05r
  p0088.clrim55r
  crln.pk0033.d8

sll.pk0015.c4                 22         51.9         51.9

5' end of sl2.pk0109.b6       24         66.1         72.7

3' end of sl2.pk0109.b6       26

Sequence alignments and percent similarity calculations were performed using the
Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc.,
Madison, WI). Multiple alignment of the amino acid sequences was performed using the
Clustal method of alignment (Higgins, D. G. and Sharp, P. M. (1989) CABIOS. 5:151-153)
with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10).

Sequence alignments and BLAST scores and probabilities indicate that the instant
nucleic acid fragments encode entire or nearly entire soybean zeaxanthin epoxidase and
portions of corn and soybean zeaxanthin epoxidase isozymes. These sequences represent
the first corn and soybean sequences encoding zeaxanthin epoxidase.

EXAMPLE 5
Expression of Chimeric Genes in Monocot Cells
A chimeric gene comprising a cDNA encoding a carotenoid biosynthetic enzyme in
sense orientation with respect to the maize 27 kD zein promoter that is located 5' to the
cDNA fragment, and the 10 kD zein 3' end that is located 3' to the cDNA fragment, can be
constructed. The cDNA fragment of this gene may be generated by polymerase chain
reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites
(Nco I or Sma I) can be incorporated into the oligonucleotides to provide proper orientation
of the DNA fragment when inserted into the digested vector pML103 as described below.
Amplification is then performed in a standard PCR. The amplified DNA is then digested
with restriction enzymes Nco I and Sma I and fractionated on an agarose gel. The
appropriate band can be isolated from the gel and combined with a 4.9 kb Nco I-Sma I
fragment of the plasmid pML103. Plasmid pML103 has been deposited under the terms of
the Budapest Treaty at ATCC (American Type Culture Collection, 10801 University Blvd.,
Manassas, VA 20110-2209), and bears accession number ATCC 97366. The DNA segment
from pML103 contains a 1.05 kb Sal I-Nco I promoter fragment of the maize 27 kD zein
gene and a 0.96 kb Sma I-Sal I fragment from the 3' end of the maize 10 kD zein gene in the
vector pGem9Zf(+) (Promega). Vector and insert DNA can be ligated at 15°C overnight,
essentially as described (Maniatis). The ligated DNA may then be used to transform E. coli
XL1-Blue (Epicurian Coli XL-1 Blue™; Stratagene). Bacterial transformants can be
screened by restriction enzyme digestion of plasmid DNA and limited nucleotide sequence
analysis using the dideoxy chain termination method (Sequenase™ DNA Sequencing Kit;
U.S. Biochemical). The resulting plasmid construct would comprise a chimeric gene
encoding, in the 5' to 3' direction, the maize 27 kD zein promoter, a cDNA fragment
encoding a carotenoid biosynthetic enzyme, and the 10 kD zein 3' region.

The chimeric gene described above can then be introduced into corn cells by the
following procedure. Immature corn embryos can be dissected from developing caryopses
derived from crosses of the inbred corn lines H99 and LH132. The embryos are isolated 10
to 11 days after pollination when they are 1.0 to 1.5 mm long. The embryos are then placed
with the axis-side facing down and in contact with agarose-solidified N6 medium (Chu
et al., (1975) Sci. Sin. Peking 18:659-668). The embryos are kept in the dark at 27°C.
Friable embryogenic callus consisting of undifferentiated masses of cells with somatic
proembryoids and embryoids borne on suspensor structures proliferates from the scutellum
of these immature embryos. The embryogenic callus isolated from the primary explant can
be cultured on N6 medium and sub-cultured on this medium every 2 to 3 weeks.

The plasmid p35S/Ac (obtained from Dr. Peter Eckes, Hoechst Ag, Frankfurt,
Germany) may be used in transformation experiments in order to provide a selectable
marker. This plasmid contains the pat gene (see European Patent Publication 0 242 236),
which encodes phosphinothricin acetyl transferase (PAT). The enzyme PAT confers
resistance to herbicidal glutamine synthetase inhibitors such as phosphinothricin. The pat
gene in p35S/Ac is under the control of the 35S promoter from Cauliflower Mosaic Virus
(Odell et al. (1985) Nature 313:810-812) and the 3' region of the nopaline synthase gene
from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens.

The particle bombardment method (Klein T. M. et al., (1987) Nature 327:70-73) may
be used to transfer genes to the callus culture cells. According to this method, gold particles
(1 µm in diameter) are coated with DNA using the following technique. Ten µg of plasmid
DNAs are added to 50 µL of a suspension of gold particles (60 mg per mL). Calcium
chloride (50 µL of a 2.5 M solution) and spermidine free base (20 µL of a 1.0 M solution)
are added to the particles. The suspension is vortexed during the addition of these solutions.
After 10 minutes, the tubes are briefly centrifuged (5 sec at 15,000 rpm) and the supernatant
removed. The particles are resuspended in 200 µL of absolute ethanol, centrifuged again
and the supernatant removed. The ethanol rinse is performed again and the particles
resuspended in a final volume of 30 µL of ethanol. An aliquot (5 µL) of the DNA-coated
gold particles can be placed in the center of a Kapton™ flying disc (Bio-Rad Labs). The
particles are then accelerated into the corn tissue with a Biolistic™ PDS-1000/He (Bio-Rad
Instruments, Hercules CA), using a helium pressure of 1000 psi, a gap distance of 0.5 cm
and a flying distance of 1.0 cm.
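
The quantities in the coating procedure above imply a per-shot particle load that can be worked out with simple arithmetic. The sketch below does this under two stated assumptions that the text does not make: that no gold is lost during the ethanol rinses, and that all of the DNA ends up on the particles, so the DNA figure is an upper bound. The function and key names are hypothetical.

```python
def per_shot_load(gold_ul, gold_mg_per_ml, dna_ug, final_ul, aliquot_ul):
    """Estimate gold and (at most) DNA delivered by one aliquot."""
    total_gold_mg = gold_ul / 1000.0 * gold_mg_per_ml  # uL -> mL, then mg
    fraction = aliquot_ul / final_ul                   # share of the prep per shot
    return {
        "gold_mg": total_gold_mg * fraction,
        "dna_ug_upper_bound": dna_ug * fraction,
        "aliquots": int(final_ul // aliquot_ul),
    }

# Values from the procedure: 10 ug DNA, 50 uL of gold at 60 mg/mL,
# final volume 30 uL, 5 uL loaded per flying disc.
load = per_shot_load(gold_ul=50, gold_mg_per_ml=60, dna_ug=10,
                     final_ul=30, aliquot_ul=5)
print(load)
```

With these inputs one preparation yields six aliquots, each carrying about 0.5 mg of gold and no more than about 1.7 µg of DNA.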

For bombardment, the embryogenic tissue is placed on filter paper over agarose-solidified
N6 medium. The tissue is arranged as a thin lawn and covers a circular area of
about 5 cm in diameter. The petri dish containing the tissue can be placed in the chamber of
the PDS-1000/He approximately 8 cm from the stopping screen. The air in the chamber is
then evacuated to a vacuum of 28 inches of Hg. The macrocarrier is accelerated with a
helium shock wave using a rupture membrane that bursts when the He pressure in the shock
tube reaches 1000 psi.

Seven days after bombardment the tissue can be transferred to N6 medium that
contains glufosinate (2 mg per liter) and lacks casein or proline. The tissue continues to
grow slowly on this medium. After an additional 2 weeks the tissue can be transferred to
fresh N6 medium containing glufosinate. After 6 weeks, areas of about 1 cm in diameter
of actively growing callus can be identified on some of the plates containing the
glufosinate-supplemented medium. These calli may continue to grow when sub-cultured on the
selective medium.

Plants can be regenerated from the transgenic callus by first transferring clusters of
tissue to N6 medium supplemented with 0.2 mg per liter of 2,4-D. After two weeks the
tissue can be transferred to regeneration medium (Fromm et al., (1990) Bio/Technology
8:833-839).

EXAMPLE 6
Expression of Chimeric Genes in Dicot Cells

A seed-specific expression cassette composed of the promoter and transcription
terminator from the gene encoding the β subunit of the seed storage protein phaseolin from
the bean Phaseolus vulgaris (Doyle et al. (1986) J. Biol. Chem. 261:9228-9238) can be used
for expression of the instant carotenoid biosynthetic enzyme in transformed soybean. The
phaseolin cassette includes about 500 nucleotides upstream (5') from the translation initiation
codon and about 1650 nucleotides downstream (3') from the translation stop codon of
phaseolin. Between the 5' and 3' regions are the unique restriction endonuclease sites Nco I
(which includes the ATG translation initiation codon), Sma I, Kpn I and Xba I. The entire
cassette is flanked by Hind III sites.

The cDNA fragment of this gene may be generated by polymerase chain reaction
(PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites can be
incorporated into the oligonucleotides to provide proper orientation of the DNA fragment
when inserted into the expression vector. Amplification is then performed as described
above, and the isolated fragment is inserted into a pUC18 vector carrying the seed
expression cassette.

Soybean embryos may then be transformed with the expression vector comprising
sequences encoding a carotenoid biosynthetic enzyme. To induce somatic embryos,
cotyledons, 3-5 mm in length, dissected from surface-sterilized, immature seeds of the
soybean cultivar A2872, can be cultured in the light or dark at 26°C on an appropriate agar
medium for 6-10 weeks. Somatic embryos which produce secondary embryos are then
excised and placed into a suitable liquid medium. After repeated selection for clusters of
somatic embryos which multiplied as early, globular staged embryos, the suspensions are
maintained as described below.

Soybean embryogenic suspension cultures can be maintained in 35 mL liquid media on a
rotary shaker, 150 rpm, at 26°C with fluorescent lights on a 16:8 hour day/night schedule.
Cultures are subcultured every two weeks by inoculating approximately 35 mg of tissue into
35 mL of liquid medium.

Soybean embryogenic suspension cultures may then be transformed by the method of
particle gun bombardment (Klein T. M. et al. (1987) Nature (London) 327:70-73, U.S.
Patent No. 4,945,050). A DuPont Biolistic™ PDS1000/HE instrument (helium retrofit) can
be used for these transformations.

A selectable marker gene which can be used to facilitate soybean transformation is a
chimeric gene composed of the 35S promoter from Cauliflower Mosaic Virus (Odell et al.
(1985) Nature 313:810-812), the hygromycin phosphotransferase gene from plasmid pJR225
(from E. coli; Gritz et al. (1983) Gene 25:179-188) and the 3' region of the nopaline synthase
gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens. The seed expression
cassette comprising the phaseolin 5' region, the fragment encoding the carotenoid
biosynthetic enzyme and the phaseolin 3' region can be isolated as a restriction fragment.
This fragment can then be inserted into a unique restriction site of the vector carrying the
marker gene.

To 50 µL of a 60 mg/mL 1 µm gold particle suspension is added (in order): 5 µL
DNA (1 µg/µL), 20 µL spermidine (0.1 M), and 50 µL CaCl2 (2.5 M). The particle
preparation is then agitated for three minutes, spun in a microfuge for 10 seconds and the
supernatant removed. The DNA-coated particles are then washed once in 400 µL 70%
ethanol and resuspended in 40 µL of anhydrous ethanol. The DNA/particle suspension can
be sonicated three times for one second each. Five µL of the DNA-coated gold particles are
then loaded on each macro carrier disk.

Approximately 300-400 mg of a two-week-old suspension culture is placed in an
empty 60x15 mm petri dish and the residual liquid removed from the tissue with a pipette.
For each transformation experiment, approximately 5-10 plates of tissue are normally
bombarded. Membrane rupture pressure is set at 1100 psi and the chamber is evacuated to a
vacuum of 28 inches mercury. The tissue is placed approximately 3.5 inches away from the
retaining screen and bombarded three times. Following bombardment, the tissue can be
divided in half and placed back into liquid and cultured as described above.

Five to seven days post bombardment, the liquid media may be exchanged with fresh
media, and eleven to twelve days post bombardment with fresh media containing 50 mg/mL
hygromycin. This selective media can be refreshed weekly. Seven to eight weeks post
bombardment, green, transformed tissue may be observed growing from untransformed,
necrotic embryogenic clusters. Isolated green tissue is removed and inoculated into
individual flasks to generate new, clonally propagated, transformed embryogenic suspension
cultures. Each new line may be treated as an independent transformation event. These
suspensions can then be subcultured and maintained as clusters of immature embryos or
regenerated into whole plants by maturation and germination of individual somatic embryos.
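
The post-bombardment timeline above can be laid out as calendar dates for planning purposes. The sketch below is one hypothetical way to do so; the step labels and the example date are illustrative only, and the day windows are taken directly from the paragraph above (5 to 7 days, 11 to 12 days, and 7 to 8 weeks).

```python
from datetime import date, timedelta

def selection_schedule(bombardment: date) -> dict:
    """Map each step to its (earliest, latest) calendar date."""
    def window(first_day: int, last_day: int):
        return (bombardment + timedelta(days=first_day),
                bombardment + timedelta(days=last_day))
    return {
        "exchange with fresh medium": window(5, 7),
        "switch to hygromycin medium": window(11, 12),
        "look for green transformed tissue": window(7 * 7, 8 * 7),
    }

# Hypothetical bombardment date, chosen only for illustration.
schedule = selection_schedule(date(2024, 3, 4))
for step, (earliest, latest) in schedule.items():
    print(f"{step}: {earliest.isoformat()} to {latest.isoformat()}")
```

The weekly refresh of the selective medium is not enumerated here; it would simply repeat every seven days from the start of hygromycin selection.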

EXAMPLE 7

Expression of Chimeric Genes in Microbial Cells
The cDNAs encoding the instant carotenoid biosynthetic enzymes can be inserted into
the T7 E. coli expression vector pBT430. This vector is a derivative of pET-3a (Rosenberg
et al. (1987) Gene 56:125-135) which employs the bacteriophage T7 RNA polymerase/T7
promoter system. Plasmid pBT430 was constructed by first destroying the EcoR I and
Hind III sites in pET-3a at their original positions. An oligonucleotide adaptor containing
EcoR I and Hind III sites was inserted at the BamH I site of pET-3a. This created pET-3aM
with additional unique cloning sites for insertion of genes into the expression vector. Then,
the Nde I site at the position of translation initiation was converted to an Nco I site using
oligonucleotide-directed mutagenesis. The DNA sequence of pET-3aM in this region,
5'-CATATGG, was converted to 5'-CCCATGG in pBT430.
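
The effect of this two-base change can be checked with a few lines of code. The sketch below uses the well-known recognition sequences of Nde I (CATATG) and Nco I (CCATGG) and the two seven-base sequences quoted above; the helper name `find_sites` is hypothetical.

```python
NDE_I = "CATATG"  # Nde I recognition sequence
NCO_I = "CCATGG"  # Nco I recognition sequence

def find_sites(seq: str, site: str) -> list:
    """Return the 0-based start positions of every occurrence of site in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]

before = "CATATGG"  # region in pET-3aM, as quoted in the text
after = "CCCATGG"   # same region in pBT430, as quoted in the text

for label, seq in (("pET-3aM", before), ("pBT430", after)):
    print(label, seq,
          "Nde I at", find_sites(seq, NDE_I),
          "Nco I at", find_sites(seq, NCO_I),
          "ATG at", find_sites(seq, "ATG"))
```

The check shows that the mutagenesis removes the Nde I site and creates an Nco I site while leaving the ATG initiation codon at the same position.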

Plasmid DNA containing a cDNA may be appropriately digested to release a nucleic
acid fragment encoding the protein. This fragment may then be purified on a 1% NuSieve
GTG™ low melting agarose gel (FMC). Buffer and agarose contain 10 µg/ml ethidium
bromide for visualization of the DNA fragment. The fragment can then be purified from the
agarose gel by digestion with GELase™ (Epicentre Technologies) according to the
manufacturer's instructions, ethanol precipitated, dried and resuspended in 20 µL of water.
Appropriate oligonucleotide adapters may be ligated to the fragment using T4 DNA ligase
(New England Biolabs, Beverly, MA). The fragment containing the ligated adapters can be
purified from the excess adapters using low melting agarose as described above. The vector
pBT430 is digested, dephosphorylated with alkaline phosphatase (NEB) and deproteinized
with phenol/chloroform as described above. The prepared vector pBT430 and fragment can
then be ligated at 16°C for 15 hours followed by transformation into DH5 electrocompetent
cells (GIBCO BRL). Transformants can be selected on agar plates containing LB media and
100 µg/mL ampicillin. Transformants containing the gene encoding the carotenoid
biosynthetic enzyme are then screened for the correct orientation with respect to the T7
promoter by restriction enzyme analysis.

For high level expression, a plasmid clone with the cDNA insert in the correct
orientation relative to the T7 promoter can be transformed into E. coli strain BL21(DE3)
(Studier et al. (1986) J. Mol. Biol. 189:113-130). Cultures are grown in LB medium
containing ampicillin (100 mg/L) at 25°C. At an optical density at 600 nm of approximately
1, IPTG (isopropylthio-β-galactoside, the inducer) can be added to a final concentration of
0.4 mM and incubation can be continued for 3 h at 25°C. Cells are then harvested by
centrifugation and re-suspended in 50 µL of 50 mM Tris-HCl at pH 8.0 containing 0.1 mM
DTT and 0.2 mM phenylmethylsulfonyl fluoride. A small amount of 1 mm glass beads can
be added and the mixture sonicated 3 times for about 5 seconds each time with a microprobe
sonicator. The mixture is centrifuged and the protein concentration of the supernatant
determined. One µg of protein from the soluble fraction of the culture can be separated by
SDS-polyacrylamide gel electrophoresis. Gels can be observed for protein bands migrating
at the expected molecular weight.

EXAMPLE 8

Evaluating Compounds for Their Ability to Inhibit the Activity
of Carotenoid Biosynthetic Enzymes
The carotenoid biosynthetic enzymes described herein may be produced using any
number of methods known to those skilled in the art. Such methods include, but are not
limited to, expression in bacteria as described in Example 7, or expression in eukaryotic cell
culture, in planta, and using viral expression systems in suitably infected organisms or cell
lines. The instant carotenoid biosynthetic enzymes may be expressed either as mature forms
of the proteins as observed in vivo or as fusion proteins by covalent attachment to a variety
of enzymes, proteins or affinity tags. Common fusion protein partners include glutathione
S-transferase ("GST"), thioredoxin ("Trx"), maltose binding protein, and C- and/or
N-terminal hexahistidine polypeptide ("(His)6"). The fusion proteins may be engineered
with a protease recognition site at the fusion point so that fusion partners can be separated by
protease digestion to yield intact mature enzyme. Examples of such proteases include
thrombin, enterokinase and factor Xa. However, any protease can be used which specifically
cleaves the peptide connecting the fusion protein and the enzyme.

Purification of the instant carotenoid biosynthetic enzymes, if desired, may utilize any
number of separation technologies familiar to those skilled in the art of protein purification.
Examples of such methods include, but are not limited to, homogenization, filtration,
centrifugation, heat denaturation, ammonium sulfate precipitation, desalting, pH
precipitation, ion exchange chromatography, hydrophobic interaction chromatography and
affinity chromatography, wherein the affinity ligand represents a substrate, substrate analog
or inhibitor. When the carotenoid biosynthetic enzymes are expressed as fusion proteins, the
purification protocol may include the use of an affinity resin which is specific for the fusion
protein tag attached to the expressed enzyme or an affinity resin containing ligands which
are specific for the enzyme. For example, a carotenoid biosynthetic enzyme may be
expressed as a fusion protein coupled to the C-terminus of thioredoxin. In addition, a (His)6
peptide may be engineered into the N-terminus of the fused thioredoxin moiety to afford
additional opportunities for affinity purification. Other suitable affinity resins could be
synthesized by linking the appropriate ligands to any suitable resin such as Sepharose-4B. In
an alternate embodiment, a thioredoxin fusion protein may be eluted using dithiothreitol;
however, elution may be accomplished using other reagents which interact to displace the
thioredoxin from the resin. These reagents include β-mercaptoethanol or other reduced
thiol. The eluted fusion protein may be subjected to further purification by traditional means
as stated above, if desired. Proteolytic cleavage of the thioredoxin fusion protein and the
enzyme may be accomplished after the fusion protein is purified or while the protein is still
bound to the ThioBond™ affinity resin or other resin.

Crude, partially purified or purified enzyme, either alone or as a fusion protein, may be
utilized in assays for the evaluation of compounds for their ability to inhibit enzymatic
activation of the carotenoid biosynthetic enzymes disclosed herein. Assays may be
conducted under well known experimental conditions which permit optimal enzymatic
activity. For example, assays for phytoene synthase are presented by Neudert U. et al.
(1998) Biochim. Biophys. Acta 1392:51-58. Assays for zeaxanthin epoxidase are presented
by Bouvier F. et al. (1996) J. Biol. Chem. 271:28861-28867.

CLAIMS

What is claimed is: 

1. An isolated nucleic acid fragment encoding all or a substantial portion of a
phytoene synthase comprising a member selected from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID
NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 and SEQ ID
NO:16;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID
NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 and SEQ ID
NO:16; and

(c) an isolated nucleic acid fragment that is complementary to (a) or (b).

2. The isolated nucleic acid fragment of Claim 1 wherein the nucleotide sequence
of the fragment comprises all or a portion of the sequence set forth in a member selected
from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13 and SEQ ID NO:15.

3. A chimeric gene comprising the nucleic acid fragment of Claim 1 operably 
linked to suitable regulatory sequences. 

4. A transformed host cell comprising the chimeric gene of Claim 3. 

5. A phytoene synthase polypeptide comprising all or a substantial portion of the
amino acid sequence set forth in a member selected from the group consisting of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ
ID NO:14 and SEQ ID NO:16.

6. An isolated nucleic acid fragment encoding all or a substantial portion of a
zeaxanthin epoxidase comprising a member selected from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID
NO:24 and SEQ ID NO:26;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID
NO:24 and SEQ ID NO:26; and

(c) an isolated nucleic acid fragment that is complementary to (a) or (b).

7. The isolated nucleic acid fragment of Claim 6 wherein the nucleotide sequence
of the fragment comprises all or a portion of the sequence set forth in a member selected
from the group consisting of SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID
NO:23 and SEQ ID NO:25.

8. A chimeric gene comprising the nucleic acid fragment of Claim 6 operably 
linked to suitable regulatory sequences. 

9. A transformed host cell comprising the chimeric gene of Claim 8. 

10. A zeaxanthin epoxidase polypeptide comprising all or a substantial portion of 
the amino acid sequence set forth in a member selected from the group consisting of SEQ ID 
NO: 18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24 and SEQ ID NO:26. 

11. A method of altering the level of expression of a carotenoid biosynthetic 
enzyme in a host cell comprising: 

(a) transforming a host cell with the chimeric gene of any of Claims 3 and 

8; and 

(b) growing the transformed host cell produced in step (a) under conditions 
that are suitable for expression of the chimeric gene 

wherein expression of the chimeric gene results in production of altered levels of a 
carotenoid biosynthetic enzyme in the transformed host cell. 

12. A method of obtaining a nucleic acid fragment encoding all or a substantial 
portion of the amino acid sequence encoding a carotenoid biosynthetic enzyme comprising: 

(a) probing a cDNA or genomic library with the nucleic acid fragment of 
any of Claims 1 and 6; 

(b) identifying a DNA clone that hybridizes with the nucleic acid fragment 
of any of Claims 1 and 6; 

(c) isolating the DNA clone identified in step (b); and 

(d) sequencing the cDNA or genomic fragment that comprises the clone 
isolated in step (c) 

wherein the sequenced nucleic acid fragment encodes all or a substantial portion of the 
amino acid sequence encoding a carotenoid biosynthetic enzyme. 

13. A method of obtaining a nucleic acid fragment encoding a substantial portion
of an amino acid sequence encoding a carotenoid biosynthetic enzyme comprising:

(a) synthesizing an oligonucleotide primer corresponding to a portion of
the sequence set forth in any of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15,
17, 19, 21, 23 and 25; and

(b) amplifying a cDNA insert present in a cloning vector using the
oligonucleotide primer of step (a) and a primer representing sequences
of the cloning vector

wherein the amplified nucleic acid fragment encodes a substantial portion of an amino acid
sequence encoding a carotenoid biosynthetic enzyme.

14. The product of the method of Claim 12.

15. The product of the method of Claim 13.

16. A method for evaluating at least one compound for its ability to inhibit the 
activity of a carotenoid biosynthetic enzyme, the method comprising the steps of: 

(a) transforming a host cell with a chimeric gene comprising a nucleic acid 
fragment encoding a carotenoid biosynthetic enzyme, operably linked 
to suitable regulatory sequences; 

(b) growing the transformed host cell under conditions that are suitable for 
expression of the chimeric gene wherein expression of the chimeric 
gene results in production of the carotenoid biosynthetic enzyme 
encoded by the operably linked nucleic acid fragment in the 
transformed host cell; 

(c) optionally purifying the carotenoid biosynthetic enzyme expressed by 
the transformed host cell; 

(d) treating the carotenoid biosynthetic enzyme with a compound to be 
tested; and 

(e) comparing the activity of the carotenoid biosynthetic enzyme that has 
been treated with a test compound to the activity of an untreated 
carotenoid biosynthetic enzyme, 

thereby selecting compounds with potential for inhibitory activity. 
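
The comparison in step (e) of Claim 16 can be sketched as a percent-inhibition calculation. The function names, the activity units, and the 50% selection threshold below are hypothetical choices for illustration; the claim does not state how the comparison is scored.

```python
# Illustrative sketch only: the threshold and the data layout are assumptions.

def percent_inhibition(treated_activity: float, untreated_activity: float) -> float:
    """Percent loss of enzyme activity relative to the untreated control."""
    if untreated_activity <= 0:
        raise ValueError("untreated activity must be positive")
    return 100.0 * (untreated_activity - treated_activity) / untreated_activity


def select_candidates(treated: dict, untreated_activity: float,
                      threshold: float = 50.0) -> list:
    """Return the compounds whose inhibition meets or exceeds the threshold."""
    return [
        name
        for name, activity in treated.items()
        if percent_inhibition(activity, untreated_activity) >= threshold
    ]


# Hypothetical activities (arbitrary units) for two test compounds.
measured = {"compound_A": 25.0, "compound_B": 90.0}
print(select_candidates(measured, untreated_activity=100.0))
```

A compound passing such a threshold would only be a candidate with potential for inhibitory activity, in keeping with the wording of the claim.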
<110> E. I. DU PONT DE NEMOURS AND COMPANY 

<120> CAROTENOID BIOSYNTHESIS ENZYMES 

<130> BB-1115-B 

<140> 
<141> 

<150> 60/083,042 

<151> APRIL 24, 1998 

<160> 28 

<170> MICROSOFT OFFICE 97 

<210> 1 

<211> 1448 

<212> DNA 

<213> Zea mays 



<400> 1 

cggaggaaga ggaggaggag agggtcctcg gctggggcct cctcggcgac gcctacgacc 60 
gctgcggcga ggtctgcgcc gagtacgcca agacctttta cctcggcacg cagctcatga 120 
ctcctgagcg gcgcaaagcc gtctgggcga tctacgtgtg gtgcagaaga actgacgagc 180 
tagtggacgg tcccaacgcg tcctacatca cgccgaccgc tctcgaccgc tgggagaagc 240 
ggctggagga tctctttgag ggccgcccgt acgacatgta cgacgccgcg ctctcggaca 300 
ccgtgtccaa gttccccgtc gatatccagc cgttcaaaga catggtccaa ggaatgaggc 360 
tggacctgtg gaagtcgagg tacatgacct tcgacgagct ctacctctac tgctactacg 420 
tcgccggcac gcagctcatg actcctgagc ggcgcaaagc cgtctgggcg atctacgtgt 480 
ggtgcagaag aactgacgag ctagtggacg gtcccaacgc gtcctacatc acgccgaccg 540 
ctctcgaccg ctgggagaag cggctggagg atctctttga gggccgcccg tacgacatgt 600 
acgacgccgc gctctcggac actgtgtcca agttccccgt cgatatccag ccgttcaaag 660 
acatggtcca aggaatgagg ctggacctgt ggaagtcgag gtacatgacc ttcgacgagc 720 
tctacctcta ctgctactac gtcgccggca ccgtcggcct catgacggtg cctgtcatgg 780 
gcatcgctcc cgactccaag gcctcgaccg agagcgtgta caatgctgct ctggctctcg 840 
gcatcgctaa ccagctgacg aatattctca gagacgtggg cgaagatgcg aggaggggga 900 
gaatatacct tccgttggac gagcttgcgc aggcaggtct cacggaagag gacatattca 960 
gagggaaagt gaccggcaag tggaggaggt tcatgaaggg ccagatccag cgtgccaggc 1020 
tcttctttga tgaggcggag aagggcgtca cccatctcga ctctgctagc agatggccgg 1080 
tgctcgcgtc tctgtggctg tacaggcaga tccttgatgc cattgaggca aacgactaca 1140 
acaacttcac caagcgtgcg tacgtcggca aggccaagaa gctgctgtcg ttaccgcttg 1200 
catatgcaag ggctgcggtt gcaccatgaa ccatccgtag atcacatctt ttttttcttt 1260 
tcttttccaa acccaccttg ttttgcccca cccttccttt tttttttgta tataatcagc 1320 
ttcagctgcc tgcatggcat aagccttgcc tgttcagggt gattccatgt ccctaaatac 1380 
tcaatcagct cttgttacaa ggaatggaga attagaattc gagaagcgta aaaaaaaaaa 1440 
aaaaaaaa 1448 



<210> 2 

<211> 408 

<212> PRT 

<213> Zea mays 

<400> 2 

Glu Glu Glu Glu Glu Glu Arg Val Leu Gly Trp Gly Leu Leu Gly Asp 
1 5 10 15 

Ala Tyr Asp Arg Cys Gly Glu Val Cys Ala Glu Tyr Ala Lys Thr Phe 
20 25 30 

Tyr Leu Gly Thr Gln Leu Met Thr Pro Glu Arg Arg Lys Ala Val Trp 
35 40 45 

Ala Ile Tyr Val Trp Cys Arg Arg Thr Asp Glu Leu Val Asp Gly Pro 
50 55 60 

Asn Ala Ser Tyr Ile Thr Pro Thr Ala Leu Asp Arg Trp Glu Lys Arg 
65 70 75 80 

Leu Glu Asp Leu Phe Glu Gly Arg Pro Tyr Asp Met Tyr Asp Ala Ala 
85 90 95 

Leu Ser Asp Thr Val Ser Lys Phe Pro Val Asp Ile Gln Pro Phe Lys 
100 105 110 

Asp Met Val Gln Gly Met Arg Leu Asp Leu Trp Lys Ser Arg Tyr Met 
115 120 125 

Thr Phe Asp Glu Leu Tyr Leu Tyr Cys Tyr Tyr Val Ala Gly Thr Gln 
130 135 140 

Leu Met Thr Pro Glu Arg Arg Lys Ala Val Trp Ala Ile Tyr Val Trp 
145 150 155 160 

Cys Arg Arg Thr Asp Glu Leu Val Asp Gly Pro Asn Ala Ser Tyr Ile 
165 170 175 

Thr Pro Thr Ala Leu Asp Arg Trp Glu Lys Arg Leu Glu Asp Leu Phe 
180 185 190 

Glu Gly Arg Pro Tyr Asp Met Tyr Asp Ala Ala Leu Ser Asp Thr Val 
195 200 205 

Ser Lys Phe Pro Val Asp Ile Gln Pro Phe Lys Asp Met Val Gln Gly 
210 215 220 

Met Arg Leu Asp Leu Trp Lys Ser Arg Tyr Met Thr Phe Asp Glu Leu 
225 230 235 240 

Tyr Leu Tyr Cys Tyr Tyr Val Ala Gly Thr Val Gly Leu Met Thr Val 
245 250 255 

Pro Val Met Gly Ile Ala Pro Asp Ser Lys Ala Ser Thr Glu Ser Val 
260 265 270 

Tyr Asn Ala Ala Leu Ala Leu Gly Ile Ala Asn Gln Leu Thr Asn Ile 
275 280 285 

Leu Arg Asp Val Gly Glu Asp Ala Arg Arg Gly Arg Ile Tyr Leu Pro 
290 295 300 

Leu Asp Glu Leu Ala Gln Ala Gly Leu Thr Glu Glu Asp Ile Phe Arg 
305 310 315 320 

Gly Lys Val Thr Gly Lys Trp Arg Arg Phe Met Lys Gly Gln Ile Gln 
325 330 335 

Arg Ala Arg Leu Phe Phe Asp Glu Ala Glu Lys Gly Val Thr His Leu 
340 345 350 

Asp Ser Ala Ser Arg Trp Pro Val Leu Ala Ser Leu Trp Leu Tyr Arg 
355 360 365 

Gln Ile Leu Asp Ala Ile Glu Ala Asn Asp Tyr Asn Asn Phe Thr Lys 
370 375 380 

Arg Ala Tyr Val Gly Lys Ala Lys Lys Leu Leu Ser Leu Pro Leu Ala 
385 390 395 400 

Tyr Ala Arg Ala Ala Val Ala Pro 

405 
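
The relationship between a nucleotide entry and its paired protein entry can be checked with a short translation sketch. It assumes the standard genetic code and a reading frame that starts at the third nucleotide (offset 2) of SEQ ID NO:1; both the helper name and that offset are inferences from the printed sequences rather than statements in the listing.

```python
# Illustrative sketch only: the frame offset is inferred, not stated in the listing.
from itertools import product

BASES = "tcag"
# Standard genetic code, codons ordered ttt, ttc, tta, ttg, tct, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate DNA to one-letter amino acids; unknown codons become X."""
    dna = dna.replace(" ", "").lower()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First row (nucleotides 1-60) of SEQ ID NO:1 as printed in the listing.
row1 = "cggaggaaga ggaggaggag agggtcctcg gctggggcct cctcggcgac gcctacgacc"
print(translate(row1, offset=2))
```

The output can be compared by eye with the first residues of SEQ ID NO:2 (Glu Glu Glu Glu Glu Glu Arg Val Leu Gly ...).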

<210> 3 
<211> 888 
<212> DNA 
<213> Zea mays 

<220> 

<221> unsure 
<222> (5) 

<220> 

<221> unsure 
<222> (10) 

<220> 

<221> unsure 
<222> (18) 

<220> 

<221> unsure 
<222> (225) 

<220> 

<221> unsure 
<222> (725) 

<220> 

<221> unsure 
<222> (809) 

<220> 

<221> unsure 
<222> (836) 

<220> 

<221> unsure 
<222> (862) 

<400> 3 

ggaangggtn gatacagntt gtatggcttg acggttgacg ataatgacgc tctgagaata 60 

ccagagcgga tttaagtttc taaactaacg ctaggacggt gaaagtggta gatacagttt 120 

gtatggcttg acggttgacg ataatgacga gggaagggat gacactgatt gatcgctgac 180 

gtgggtgttc tatctccgcg cacgcgcgct cctgttcagt gtggngcagg agaacggacg 240 

agctcgtgga cggccccaac gcgtcccaca tctcggcgct ggcgctggac cggtgggagt 300 

cgcggctgga ggacatcttc gccggccggc cgtacgacat gctcgacgcc gccctgtccg 360 

acaccgtcgc caggttcccc gtcgacatcc agccgttcag ggacatgatc gaggggatgc 420 

gcatggacct gaagaagtcc cggtacagga gcttcgacga gctgtacctc tactgctact 480 

acgtggccgg caccgtgggg ctgatgagcg tcccggtgat gggcatctcg ccggcgtcca 540 

gggcggccac cgagacggtg tacaaggggg cgctggcgct gggcctggcg aaccagctca 600 

ccaacatcct cagggacgtc ggcgaggacg ccaggagggg acggatctac ctcccgcaag 660 
acgagctgga gatggcgggg ctctccgacg ccgaacgtcc tggacgggcc gcgtcaacga 720 

acgantggaa gggcttcatg aagggccaga ttcgcgaagg ccaaaacctt cttcaaggca 780 

agccggaagg aaagcgccaa cgaagctcna accaaggaga gccgattgcc ggtgtngtct 840 

tctctgctcc ttgtaccggc anatcctcga acgaaatcga aggccaac 888 
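
The "unsure" feature positions declared for an entry can be cross-checked against the n symbols in its sequence. The sketch below is a hypothetical helper, not part of the listing, applied to the first row of SEQ ID NO:3.

```python
# Illustrative sketch only: a hypothetical consistency check.

def unsure_positions(dna: str) -> list:
    """Return the 1-based positions of every n in a DNA string."""
    dna = dna.replace(" ", "").lower()
    return [pos for pos, base in enumerate(dna, start=1) if base == "n"]


# First row (nucleotides 1-60) of SEQ ID NO:3 as printed in the listing.
row1 = "ggaangggtn gatacagntt gtatggcttg acggttgacg ataatgacgc tctgagaata"
print(unsure_positions(row1))
```

For this row the positions found agree with the first three `<222>` entries, (5), (10) and (18).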



<210> 4 

<211> 186 

<212> PRT 

<213> Zea mays 



<220> 

<221> UNSURE 
<222> (3) 



<220> 

<221> UNSURE 
<222> (169) 



<400> 4 

Val Trp Xaa Arg Arg Thr Asp Glu Leu Val Asp Gly Pro Asn Ala Ser 
1 5 10 15 

His Ile Ser Ala Leu Ala Leu Asp Arg Trp Glu Ser Arg Leu Glu Asp 
20 25 30 

Ile Phe Ala Gly Arg Pro Tyr Asp Met Leu Asp Ala Ala Leu Ser Asp 
35 40 45 

Thr Val Ala Arg Phe Pro Val Asp Ile Gln Pro Phe Arg Asp Met Ile 
50 55 60 

Glu Gly Met Arg Met Asp Leu Lys Lys Ser Arg Tyr Arg Ser Phe Asp 
65 70 75 80 

Glu Leu Tyr Leu Tyr Cys Tyr Tyr Val Ala Gly Thr Val Gly Leu Met 
85 90 95 

Ser Val Pro Val Met Gly Ile Ser Pro Ala Ser Arg Ala Ala Thr Glu 
100 105 110 

Thr Val Tyr Lys Gly Ala Leu Ala Leu Gly Leu Ala Asn Gln Leu Thr 
115 120 125 

Asn Ile Leu Arg Asp Val Gly Glu Asp Ala Arg Arg Gly Arg Ile Tyr 
130 135 140 

Leu Pro Gln Asp Glu Leu Glu Met Ala Gly Leu Ser Asp Ala Glu Arg 
145 150 155 160 

Pro Gly Arg Ala Ala Ser Thr Asn Xaa Trp Lys Gly Phe Met Lys Gly 
165 170 175 

Gln Ile Arg Glu Gly Gln Asn Leu Leu Gln 
180 185 

<210> 5 

<211> 766 

<212> DNA 

<213> Oryza sativa 
<220> 

<221> unsure 
<222> (658) 

<400> 5 

cgcagactct cgactttgtc actagcatca ttgcttgatg atcgatgctg agctgcaacc 60 

aagcaccagc atatcctttc cttcattcct tcctggtgct ggtagaagaa gaacaagcta 120 

gctagagtga taagagctag ctaccttgca gatcgatctc cggccagcga ttgatcccat 180 

ccagtataat aatggcggcc atcacgctcc tacgttcagc gtctcttccg ggcctctccg 240 

acgccctcgc ccgggacgct gctgccgtcc aacatgtctg ctcctcctac ctgcccaaca 300 

acaaggagaa gaagagggag gtggatcctc tgctcgctca agtacgcctg ccttggcgtc 360 

gaccctgccc cgggcgagat tgcccggacc tcgccggtgt actccagcct caccgtcacc 420 

cctgctggag aggccgtcat ctcctcggag cagaaggtgt acgacgtcgt cctcaagcag 480 

gcagcattgc tcaaacgcca cctgcgccca caaccacaca ccattcccat cgttcccaag 540 

gacctggacc tgccaagaaa cggcctcaag caggcctatc atcgctgcgg agagatctgc 600 

gaggagtatg ccaagacctt ttaccttgga actatgctca tgacggagga ccgacggngc 660 

gccatatggg ccatctatgt gtggtgtagg agggcaaatg agcttgtaga tggaccaaat 720 

gcctcgcaca tcacaacgtc aagcctggac ggtggggaaa agaggt 766 

<210> 6 

<211> 164 

<212> PRT 

<213> Oryza sativa 

<220> 

<221> UNSURE 
<222> (129) 

<400> 6 

Met Ser Ala Pro Pro Thr Cys Pro Thr Thr Arg Arg Arg Arg Gly Arg 
1 5 10 15 

Trp Ile Leu Cys Ser Leu Lys Tyr Ala Cys Leu Gly Val Asp Pro Ala 
20 25 30 

Pro Gly Glu Ile Ala Arg Thr Ser Pro Val Tyr Ser Ser Leu Thr Val 
35 40 45 

Thr Pro Ala Gly Glu Ala Val Ile Ser Ser Glu Gln Lys Val Tyr Asp 
50 55 60 

Val Val Leu Lys Gln Ala Ala Leu Leu Lys Arg His Leu Arg Pro Gln 
65 70 75 80 

Pro His Thr Ile Pro Ile Val Pro Lys Asp Leu Asp Leu Pro Arg Asn 
85 90 95 

Gly Leu Lys Gln Ala Tyr His Arg Cys Gly Glu Ile Cys Glu Glu Tyr 
100 105 110 

Ala Lys Thr Phe Tyr Leu Gly Thr Met Leu Met Thr Glu Asp Arg Arg 
115 120 125 

Xaa Ala Ile Trp Ala Ile Tyr Val Trp Cys Arg Arg Ala Asn Glu Leu 
130 135 140 

Val Asp Gly Pro Asn Ala Ser His Ile Thr Thr Ser Ser Leu Asp Gly 
145 150 155 160 

Gly Glu Lys Arg 
<210> 7 

<211> 476 

<212> DNA 

<213> Oryza sativa 

<220> 

<221> unsure 
<222> (2) 

<220> 

<221> unsure 
<222> (275) 

<220> 

<221> unsure 
<222> (453) 

<220> 

<221> unsure 
<222> (459) 

<400> 7 

cttacatgta agctcgtgcc gaattcngca cgagcttaca ccctaactct tcttacatta 60 
caccaaaggc acttgatcga tgggagaaga gattagaaga tctcttcgaa ggcaggccat 120 
atgatatgta tgatgcagcc ctctcggaca cagtgtcaaa gtttccagta gatatccagc 180 
cattcaaaga catgattgaa ggaatgaggc ttgacctgtg gaaatcaagg tataggagct 240 
ttgatgagct ctacctctac tgctactacg ttgctggcac ggttggtctc atgacagtac 300 
cggtgatggg gattgccccc gactcgaagg cctcaacccg agagcgtgta caacgctgcg 360 
ctagctnctt gggatcgcca acccagctga cgaaatattc tcaagangac gttaggccaa 420 
agaacccaag ggagggggaa agaatctaac ccntccaant ggggatgaaa ttggga 476 

<210> 8 

<211> 108 

<212> PRT 

<213> Oryza sativa 

<400> 8 

Pro Asn Ser Ser Tyr Ile Thr Pro Lys Ala Leu Asp Arg Trp Glu Lys 
1 5 10 15 

Arg Leu Glu Asp Leu Phe Glu Gly Arg Pro Tyr Asp Met Tyr Asp Ala 
20 25 30 

Ala Leu Ser Asp Thr Val Ser Lys Phe Pro Val Asp Ile Gln Pro Phe 
35 40 45 

Lys Asp Met Ile Glu Gly Met Arg Leu Asp Leu Trp Lys Ser Arg Tyr 
50 55 60 

Arg Ser Phe Asp Glu Leu Tyr Leu Tyr Cys Tyr Tyr Val Ala Gly Thr 
65 70 75 80 

Val Gly Leu Met Thr Val Pro Val Met Gly Ile Ala Pro Asp Ser Lys 
85 90 95 

Ala Gln Pro Glu Ser Val Tyr Asn Ala Ala Leu Ala 
100 105 



<210> 9 
<211> 1060 
<212> DNA 

<213> Oryza sativa 



<220> 

<221> unsure 

<222> (2) 

<220> 

<221> unsure 

<222> (275) 



<400> 9 

gnacatcaca ccgtcagccc tgggaccggt gggagaagag gcttgatgat ctcttcaccg 60 
gacgccceta cgacatgctt gatgctgcac tttctgatac catctccaag tttcctatag 120 
atattcagcc tttcagggac atgatagaag ggatgcggtc agacctcaga aagactagat 180 
acaagaactt cgacgagctc tacatgtact gctactatgt tgctggaact gtggggctaa 240 
tgagtgttcc tgtgatgggt attgcacccg agtcnaaggc aacaactgaa agtgtgtaca 300 
gtgctgcttt ggctctcggg aatgcaaacc agctcacaaa tatactccgt gacgttggag 360 
aggacgcgag aagagggagg atatatttac cacaagatga acttgcagag gcaaggctct 420 
ctgatgagga catcttcaat ggcgttgtga ctaacaaatg gagaagcttc atgaagagac 480 
agatcaagag agctaggatg ttttttgagg aggcagagag aggggtgacc gagctcagcc 540 
aggcaagccg gtggccggtc tgggcgtctc tgttgttata ccggcaaatc cttgacgaga 600 
tagaagcaaa cgattacaac aacttcacaa agagggcgta cgttgggaag gcgaagaaat 660 
tgctagcgct tccagttgca tatggtagat cattgctgat gccctactca ctgagaaata 720 
gccagaagta ggaggcggga agaggagata aagggaagat gatgagcagg ttaggcttag 780 
ataggaaaaa tcagacagca tctgccttcc gattaatgtt gaggaaatta tattattgtg 840 
tgtatcatac atagcatgta tagggaaaat gctgcaggca ggcaggcagg ctaggtgatg 900 
gttgaatatt tccttcacat catgtatgta tatccttcct tgatgctaca gcacatatgt 960 
atgtatgact ctgaagaaag agcaacctgt atagtagcta accggctatg gcctatgtat 1020 
gggccgcaga ggtgagcaaa caaaaaaaaa aaaaaaaaaa 1060 

<210> 10 
<211> 242 
<212> PRT 
<213> Oryza sativa 

<400> 10 

Thr Ser His Arg Gln Pro Trp Asp Arg Trp Glu Lys Arg Leu Asp Asp 
1 5 10 15 

Leu Phe Thr Gly Arg Pro Tyr Asp Met Leu Asp Ala Ala Leu Ser Asp 
20 25 30 

Thr Ile Ser Lys Phe Pro Ile Asp Ile Gln Pro Phe Arg Asp Met Ile 
35 40 45 

Glu Gly Met Arg Ser Asp Leu Arg Lys Thr Arg Tyr Lys Asn Phe Asp 
50 55 60 

Glu Leu Tyr Met Tyr Cys Tyr Tyr Val Ala Gly Thr Val Gly Leu Met 
65 70 75 80 

Ser Val Pro Val Met Gly Ile Ala Pro Glu Ser Lys Ala Thr Thr Glu 
85 90 95 

Ser Val Tyr Ser Ala Ala Leu Ala Leu Gly Asn Ala Asn Gln Leu Thr 
100 105 110 

Asn Ile Leu Arg Asp Val Gly Glu Asp Ala Arg Arg Gly Arg Ile Tyr 
115 120 125 

Leu Pro Gln Asp Glu Leu Ala Glu Ala Arg Leu Ser Asp Glu Asp Ile 
130 135 140 

Phe Asn Gly Val Val Thr Asn Lys Trp Arg Ser Phe Met Lys Arg Gln 
145 150 155 160 

Ile Lys Arg Ala Arg Met Phe Phe Glu Glu Ala Glu Arg Gly Val Thr 
165 170 175 

Glu Leu Ser Gln Ala Ser Arg Trp Pro Val Trp Ala Ser Leu Leu Leu 
180 185 190 

Tyr Arg Gln Ile Leu Asp Glu Ile Glu Ala Asn Asp Tyr Asn Asn Phe 
195 200 205 

Thr Lys Arg Ala Tyr Val Gly Lys Ala Lys Lys Leu Leu Ala Leu Pro 
210 215 220 

Val Ala Tyr Gly Arg Ser Leu Leu Met Pro Tyr Ser Leu Arg Asn Ser 
225 230 235 240 

Gln Lys 



<210> 11 

<211> 992 

<212> DNA 

<213> Glycine max 



<220> 

<221> unsure 

<222> (14) 

<220> 

<221> unsure 

<222> (23) 



<400> 11 

catttctatc gtgnatatgg ctnacatcga cctcaacgac cactttgcct aggtgggaat 60 
caaaattgga agaacttttc caaggtcgtc catttgatat gcttgatgct gctttatcag 120 
atacagttgc caaattccct gttgatatcc agccatttaa agatatgata gaaggaatga 180 
gactggatct taagaagcca agatacagaa actttgatga actatatctt tactgttact 240 
atgttgctgg gacagttggt ataatgagtg ttccaatcat gggcatttca ccaaattccc 300 
aagccacaac agagagtgta tacaatgctg ccttggccct aggcattgca aatcagctaa 360 
ccaacatact cagagatgtt ggagaggatg ccagcagagg aagagtgtat cttccacaag 420 
atgagttggc tcaagcaggg ctttccgatg aagacatttt tgctggtaag gtgacagaca 480 
agtggaggaa cttcatgaag agccaaatta aaagggcaag aatgtttttt gatgaggcag 540 
aaaagggagt gacggagctt aatgaagcta gcagatggcc tgtatgggcg tctttgctat 600 
tgtatcgcca aatattggac gagatagaag ctaatgatta caacaatttc actagaaggg 660 
cttatgtgag caaagccaag aagttacttt ctttgccagc tgcatatgct agatctatgg 720 
ttcctccatc aaaaaagtta tcttctgtaa tgaagacata aatcgagcac cttatggcat 780 
tctgtagaaa aatggataag gaggaccaca gaaaatggaa aggcacaatt tgtatatgat 840 
aaaacaaggc atgatattag tcaatattgg attttgatat tcatatttcc ccgtattttt 900 
ttacataaaa aaagtttgga ctaatatttt gttactttag agttaatttt gatgcgagtt 960 
atgaattatt tgaactgaaa aaaaaaaaaa aa 992 



<210> 12 

<211> 252 

<212> PRT 

<213> Glycine max 
<220> 

<221> UNSURE 
<222> (4) 

<400> 12 

Phe Leu Ser Xaa Ile Trp Leu Thr Ser Thr Ser Thr Thr Thr Leu Pro 
1 5 10 15 

Arg Trp Glu Ser Lys Leu Glu Glu Leu Phe Gln Gly Arg Pro Phe Asp 
20 25 30 

Met Leu Asp Ala Ala Leu Ser Asp Thr Val Ala Lys Phe Pro Val Asp 
35 40 45 

Ile Gln Pro Phe Lys Asp Met Ile Glu Gly Met Arg Leu Asp Leu Lys 
50 55 60 

Lys Pro Arg Tyr Arg Asn Phe Asp Glu Leu Tyr Leu Tyr Cys Tyr Tyr 
65 70 75 80 

Val Ala Gly Thr Val Gly Ile Met Ser Val Pro Ile Met Gly Ile Ser 
85 90 95 

Pro Asn Ser Gln Ala Thr Thr Glu Ser Val Tyr Asn Ala Ala Leu Ala 
100 105 110 

Leu Gly Ile Ala Asn Gln Leu Thr Asn Ile Leu Arg Asp Val Gly Glu 
115 120 125 

Asp Ala Ser Arg Gly Arg Val Tyr Leu Pro Gln Asp Glu Leu Ala Gln 
130 135 140 

Ala Gly Leu Ser Asp Glu Asp Ile Phe Ala Gly Lys Val Thr Asp Lys 
145 150 155 160 

Trp Arg Asn Phe Met Lys Ser Gln Ile Lys Arg Ala Arg Met Phe Phe 
165 170 175 

Asp Glu Ala Glu Lys Gly Val Thr Glu Leu Asn Glu Ala Ser Arg Trp 
180 185 190 

Pro Val Trp Ala Ser Leu Leu Leu Tyr Arg Gln Ile Leu Asp Glu Ile 
195 200 205 

Glu Ala Asn Asp Tyr Asn Asn Phe Thr Arg Arg Ala Tyr Val Ser Lys 
210 215 220 

Ala Lys Lys Leu Leu Ser Leu Pro Ala Ala Tyr Ala Arg Ser Met Val 
225 230 235 240 

Pro Pro Ser Lys Lys Leu Ser Ser Val Met Lys Thr 
245 250 

<210> 13 

<211> 1397 

<212> DNA 

<213> Glycine max 

<400> 13 

gttttgctaa cacaagtata cactcattct caaaaggttt tcatccaatt tctttccctc 60 
tcttttcatt ggtgtgcact ttcacttgtg gagctgcatc aactgcagtg gaaattgtgc 120 
tttgttcttg agatgtctgg tgttcttctt tgggtgagtt gtggacccaa agagaacatc 180 
aactccttgg tgagtttttc atgcaggagt agtagtggtg gtgaaagaac acaaaagaga 240 
ttttctggaa tcagttttgc tagtggtact tctgcttttt caagtgcagt ggcagctact 300 
gagacttcaa gatcttcaga ggagagggtc tatgaagtgg ttctgaagca agcagctttg 360 
gtaaaagaac acaaaagggg tacaaaaata gctttggatt tggacaaaga tgttgaggct 420 
gatttcaaca atgtggatct gttgaatgcg gcttatgatc ggtgtggtga agtttgtgct 480 
gagtatgcca agacatttta cttaggcaca caattgatga ctgcagagcg ccgaaaagca 540 
atttgggcaa tttatgtgtg gtgcagaaga actgatgagc tagtggatgg cccaaatgct 600 
tcacacatca cccctggggc cttggacagg tgggagcaac gattgagtga tgtttttgaa 660 
ggtcgaccct atgatatgta tgatgctgcc ctctcacata ctgtctcaaa gtacccggtt 720 
gatattcagc ccttcaagga catgatcgaa gggatgaggg tggacctgag aaagtcaaga 780 
tacaataact ttgatgagct ctacctttac tgctactatg ttgctgggac agtaggcctt 840 
atgagtgtcc cagtaatggg gatagcacca gaatcaaatg cttcatcaga gagcatttat 900 
aatgctgcat tggctctagg cattgcaaat caacttacca acatacttag agatgttgga 960 
gaagatgcta gaagaggaag agtatatctc ccacaagatg aattggcaca agctggcctt 1020 
tcagatgatg acattttccg cggaagagtt acagacaaat ggcggaaatt catgaaggga 1080 
caaataaaga gggcgaggat gttttttgat gaggcagaga gaggggttgc agagctcaac 1140 
tcagctagca ggtggcctgt gtgggcatca ttgttgttgt ataggcaaat attagattcc 1200 
attgaagcca atgattataa taacttcaca aaaagggcat atgtaggaaa agtaaagaaa 1260 
ctcttgtcac tacctactgc ctatggtttt tcacttctag gccctcagaa gtttaccaaa 1320 
atggttagga ggtaactgtt atacaatgtg tgatactttt gagttacaac tgtatacatc 1380 
tcaagttaaa aaaaaaa 1397 



<210> 14 

<211> 400 

<212> PRT 

<213> Glycine max 



<400> 14 

Met Ser Gly Val Leu Leu Trp Val Ser Cys Gly Pro Lys Glu Asn Ile 
1 5 10 15 

Asn Ser Leu Val Ser Phe Ser Cys Arg Ser Ser Ser Gly Gly Glu Arg 
20 25 30 

Thr Gln Lys Arg Phe Ser Gly Ile Ser Phe Ala Ser Gly Thr Ser Ala 
35 40 45 

Phe Ser Ser Ala Val Ala Ala Thr Glu Thr Ser Arg Ser Ser Glu Glu 
50 55 60 

Arg Val Tyr Glu Val Val Leu Lys Gln Ala Ala Leu Val Lys Glu His 
65 70 75 80 

Lys Arg Gly Thr Lys Ile Ala Leu Asp Leu Asp Lys Asp Val Glu Ala 
85 90 95 

Asp Phe Asn Asn Val Asp Leu Leu Asn Ala Ala Tyr Asp Arg Cys Gly 
100 105 110 

Glu Val Cys Ala Glu Tyr Ala Lys Thr Phe Tyr Leu Gly Thr Gln Leu 
115 120 125 

Met Thr Ala Glu Arg Arg Lys Ala Ile Trp Ala Ile Tyr Val Trp Cys 
130 135 140 

Arg Arg Thr Asp Glu Leu Val Asp Gly Pro Asn Ala Ser His Ile Thr 
145 150 155 160 

Pro Gly Ala Leu Asp Arg Trp Glu Gln Arg Leu Ser Asp Val Phe Glu 
165 170 175 

Gly Arg Pro Tyr Asp Met Tyr Asp Ala Ala Leu Ser His Thr Val Ser 
180 185 190 

Lys Tyr Pro Val Asp Ile Gln Pro Phe Lys Asp Met Ile Glu Gly Met 
195 200 205 

Arg Val Asp Leu Arg Lys Ser Arg Tyr Asn Asn Phe Asp Glu Leu Tyr 
210 215 220 

Leu Tyr Cys Tyr Tyr Val Ala Gly Thr Val Gly Leu Met Ser Val Pro 
225 230 235 240 

Val Met Gly Ile Ala Pro Glu Ser Asn Ala Ser Ser Glu Ser Ile Tyr 
245 250 255 

Asn Ala Ala Leu Ala Leu Gly Ile Ala Asn Gln Leu Thr Asn Ile Leu 
260 265 270 

Arg Asp Val Gly Glu Asp Ala Arg Arg Gly Arg Val Tyr Leu Pro Gln 
275 280 285 

Asp Glu Leu Ala Gln Ala Gly Leu Ser Asp Asp Asp Ile Phe Arg Gly 
290 295 300 

Arg Val Thr Asp Lys Trp Arg Lys Phe Met Lys Gly Gln Ile Lys Arg 
305 310 315 320 

Ala Arg Met Phe Phe Asp Glu Ala Glu Arg Gly Val Ala Glu Leu Asn 
325 330 335 

Ser Ala Ser Arg Trp Pro Val Trp Ala Ser Leu Leu Leu Tyr Arg Gln 
340 345 350 

Ile Leu Asp Ser Ile Glu Ala Asn Asp Tyr Asn Asn Phe Thr Lys Arg 
355 360 365 

Ala Tyr Val Gly Lys Val Lys Lys Leu Leu Ser Leu Pro Thr Ala Tyr 
370 375 380 

Gly Phe Ser Leu Leu Gly Pro Gln Lys Phe Thr Lys Met Val Arg Arg 
385 390 395 400 

<210> 15 
<211> 1021 
<212> DNA 

<213> Triticum aestivum 
<400> 15 

cggacgagga gaactgatga gctagtggat ggccctaact catcttacat cacgcccaag 60 

gcgctcgatc ggtgggagaa gagattagag gatctcttcg aaggccgccc atatgatatg 120 

tatgatgcag ccctctcaga tacagcgtca aagtttccaa ttgatatcca gccattcaga 180 

gacatgattg aagggatgag gctcgacctt tggaaatcga ggtataggac ctttgacgag 240 

ctctacctct actgctacta cgtcgctggc actgtcggtc tcatgacggt accggtgatg 300 

gggattgctc cggactcaaa ggcctcagca gagagcgtgt acaatgccgc actggccctt 360 

ggcattgcca accagctcac aaacatcctc cgagacgtag gagaagactc aagaaggggg 420 

agaatatacc ttccactgga cgaactggca caggcgggtc tgacagaaga ggacatattc 480 

agagggaaag tgacggataa atggaggagg ttcatgaagg ggcaaatcca gcgcgccagg 540 

ctcttctttg acgaggccga gaagggcgtc atgcatctag actccgcgag cagatggccg 600 

gtcctggcat cgctgtggct gtacaggcag atcctggacg ccatcgaggc caacgactac 660 

aacaacttca ccaagcgcgc gtacgtgggc aaggcaaaga agttcctgtc tctaccggcc 720 
gcgtacgcga gggcggctct ctcgccatga gcaaagcaat cccgtagatc agatgttttt 780 

tcttcttctt tttctttctt tttgtcctgt caccctacaa tgatttttgt tggctgttgt 840 

atatactcag ctatatgttt gccatacgcc cgccgcggta tttaggtcaa gggaccgacg 900 

tcgggccccg ctgtactgaa gtctgaaaca cttgttgtta ccacacagtg gagaatcaaa 960 

attgctccag ttgaatgaag aagaaacaaa cactctttct tcctaaaaaa aaaaaaaaaa 1020 

1021 



<210> 16 

<211> 248 

<212> PRT 

<213> Triticum aestivum 



<400> 16 

Thr Arg Arg Thr Asp Glu Leu Val Asp Gly Pro Asn Ser Ser Tyr Ile 
1 5 10 15 

Thr Pro Lys Ala Leu Asp Arg Trp Glu Lys Arg Leu Glu Asp Leu Phe 
20 25 30 

Glu Gly Arg Pro Tyr Asp Met Tyr Asp Ala Ala Leu Ser Asp Thr Ala 
35 40 45 

Ser Lys Phe Pro Ile Asp Ile Gln Pro Phe Arg Asp Met Ile Glu Gly 
50 55 60 

Met Arg Leu Asp Leu Trp Lys Ser Arg Tyr Arg Thr Phe Asp Glu Leu 
65 70 75 80 

Tyr Leu Tyr Cys Tyr Tyr Val Ala Gly Thr Val Gly Leu Met Thr Val 
85 90 95 

Pro Val Met Gly Ile Ala Pro Asp Ser Lys Ala Ser Ala Glu Ser Val 
100 105 110 

Tyr Asn Ala Ala Leu Ala Leu Gly Ile Ala Asn Gln Leu Thr Asn Ile 
115 120 125 

Leu Arg Asp Val Gly Glu Asp Ser Arg Arg Gly Arg Ile Tyr Leu Pro 
130 135 140 

Leu Asp Glu Leu Ala Gln Ala Gly Leu Thr Glu Glu Asp Ile Phe Arg 
145 150 155 160 

Gly Lys Val Thr Asp Lys Trp Arg Arg Phe Met Lys Gly Gln Ile Gln 
165 170 175 

Arg Ala Arg Leu Phe Phe Asp Glu Ala Glu Lys Gly Val Met His Leu 
180 185 190 

Asp Ser Ala Ser Arg Trp Pro Val Leu Ala Ser Leu Trp Leu Tyr Arg 
195 200 205 

Gln Ile Leu Asp Ala Ile Glu Ala Asn Asp Tyr Asn Asn Phe Thr Lys 
210 215 220 

Arg Ala Tyr Val Gly Lys Ala Lys Lys Phe Leu Ser Leu Pro Ala Ala 
225 230 235 240 

Tyr Ala Arg Ala Ala Leu Ser Pro 
245 






WO 99/55889 



PCT/US99/08789 



<210> 17 

<211> 722 

<212> DNA 

<213> Zea mays 

<220> 

<221> unsure 

<222> (324) 

<220> 

<221> unsure 

<222> (525) 

<220> 

<221> unsure 

<222> (532) 

<220> 

<221> unsure 

<222> (534) 

<220> 

<221> unsure 

<222> (539) 

<220> 

<221> unsure 

<222> (554) 

<220> 

<221> unsure 
<222> (585) 

<220> 

<221> unsure 
<222> (613) 

<220> 

<221> unsure 
<222> (635) 

<220> 

<221> unsure 
<222> (642) 

<220> 

<221> unsure 
<222> (645) 

<220> 

<221> unsure 
<222> (651) 

<220> 

<221> unsure 
<222> (669) 

<220> 

<221> unsure 
<222> (675)..(676) 
<220> 

<221> unsure 
<222> (719) 

<400> 17 

gccgtcgacg ccgccgcggc cgacgaggtc atggacgccg gctgcgtcac gggggaccgc 60 
gtcaacggca tcgttgacgg cgtttctggc tcctggtaca tcaagtttga tacgtttact 120 
cctgcagctg agcgggggct cccggtcaca agggtcatta gccgcatgac gctgcaacag 180 
atccttgctc gagcagttgg cgatgacgct atattgaatg gaagccatgt agtcgatttt 240 
acagatgatg gcagtaaggt tactgccata ttggaggacg gtaggatatt tgaaggtgac 300 
cttttggttg gtgccgatgg aatntggtca aaggtgagga agacactatt cgggcactca 360 
gatgccacct attcaggtta catctgcaat tccagtgtag cagattttgt gccacctgat 420 
atcgatacag ttgggtaccg agtatttctt ggccacaaac agtacttcgt ctcttcggat 480 
gtcggtgctg gtaaaatgca atggtacgct tttcacaatg aagangctgg tngnactgnc 540 
cctgaaatgg caanaaagaa aaaattgctt gagatattcg acggntgggt ggataatgtt 600 
aatgatttga tanatgcaac tgaggaagaa gcagntcttc gncgngatat ntacggcggc 660 
ccacctaanc gatgnnattg gggggaaagg ccgggcacct tgcttgggga tctggccang 720 
ct 722 

<210> 18 

<211> 121 

<212> PRT 

<213> Zea mays 

<220> 

<221> UNSURE 
<222> (95) 

<400> 18 

Gly Cys Val Thr Gly Asp Arg Val Asn Gly Ile Val Asp Gly Val Ser 
1 5 10 15 

Gly Ser Trp Tyr Ile Lys Phe Asp Thr Phe Thr Pro Ala Ala Glu Arg 
20 25 30 

Gly Leu Pro Val Thr Arg Val Ile Ser Arg Met Thr Leu Gln Gln Ile 
35 40 45 

Leu Ala Arg Ala Val Gly Asp Asp Ala Ile Leu Asn Gly Ser His Val 
50 55 60 

Val Asp Phe Thr Asp Asp Gly Ser Lys Val Thr Ala Ile Leu Glu Asp 
65 70 75 80 

Gly Arg Ile Phe Glu Gly Asp Leu Leu Val Gly Ala Asp Gly Xaa Trp 
85 90 95 

Ser Lys Val Arg Lys Thr Leu Phe Gly His Ser Asp Ala Thr Tyr Ser 
100 105 110 

Gly Tyr Ile Cys Asn Ser Ser Val Ala 
115 120 


<210> 19 
<211> 1246 
<212> DNA 
<213> Zea mays 

<220> 
<221> unsure 
<222> (367) 

<400> 19 

aagaaagagg agctcggaca angcagagcg ccatcgttcg gtttccttgc tgaattcccg 60 

atcgctcgct cgctcgaaaa gaaagaagct agcttttagc atggctattg aggatggtta 120 

ccagctggct gtagagctag agaatgcctg gcaagagagt gtcaaaactg aaactcctat 180 

agacatagtt tcctccttga ggcgctacga gaaagagaga aggctgcgtg ttgctattat 240 

acatggactg gcaagaatgg cagcaatcat ggctaccacc tatagaccgt acttgggtgt 300 

tggtctaggg cctttatcgt ttttgaccaa gttgcggata ccacaccctg gaagagtcgg 360 

tggcagnttc ttcatcaagt atggaatgcc tacgatgttg agctgggtgc ttggtggcaa 420 

cagctcaaaa ctagaaggaa gacttttaag ctgccgactt tctgacaagg caaatgacca 480 

gctttatcaa tggtttgagg atgatgacgc actggaagaa gctatgggtg gagaatggta 540 

cctcatcgca acaagtgaag gaaactgcaa tagcttgcag cccattcatt taattaggga 600 

tgagcagagg tcactctttg ttggaagccg gtcagatcct aatgattcag cttcttccct 660 

atcattgtcc tctccacaga tatcagaaag acatgctact atcacatgca agaataaagc 720 

tttctatctg actgatctcg gaagcgaaca tggtacctgg attaccgaca atgaaggtag 780 

acgttaccgc gtgccaccaa acttcccagt tcgtttccat ccctccgatg tcattgagtt 840 

tggttccgat aagaaggcta tgttccgggt gaaggtgctg aacacgctcc cgtatgaatc 900 

tgcaagaagt gggaatcggc agcaacagca agtccttcag gcagcatgaa tggagacact 960 

ggctaccacc actatcatca gccacactgt actgtacagc atccggtaaa gacacaacac 1020 

tgcatcacgg aaaggataca ctcgttctcg aatatttgtc gtctgctagt tcaattttaa 1080 

actaaaacgt gacaaatgaa aaaacgaagg aagtagaaga tatgtcaaaa cacatgcaat 1140 

ttttgcatcc atgaagatgc caaacaggat cttgaatact agcacctagc ggattgaaat 1200 

aatgaagttg cagttctgcg tgaactggat tgtacgatag ggatag 1246 

<210> 20 

<211> 315 

<212> PRT 

<213> Zea mays 

<220> 

<221> UNSURE 
<222> (7) 

<220> 

<221> UNSURE 
<222> (122) 

<400> 20 

Arg Lys Arg Ser Ser Asp Xaa Ala Glu Arg His Arg Ser Val Ser Leu 
1 5 10 15 

Leu Asn Ser Arg Ser Leu Ala Arg Ser Lys Arg Lys Lys Leu Ala Phe 
20 25 30 

Ser Met Ala Ile Glu Asp Gly Tyr Gln Leu Ala Val Glu Leu Glu Asn 
35 40 45 

Ala Trp Gln Glu Ser Val Lys Thr Glu Thr Pro Ile Asp Ile Val Ser 
50 55 60 

Ser Leu Arg Arg Tyr Glu Lys Glu Arg Arg Leu Arg Val Ala Ile Ile 
65 70 75 80 

His Gly Leu Ala Arg Met Ala Ala Ile Met Ala Thr Thr Tyr Arg Pro 
85 90 95 

Tyr Leu Gly Val Gly Leu Gly Pro Leu Ser Phe Leu Thr Lys Leu Arg 
100 105 110 

Ile Pro His Pro Gly Arg Val Gly Gly Xaa Phe Phe Ile Lys Tyr Gly 
115 120 125 

Met Pro Thr Met Leu Ser Trp Val Leu Gly Gly Asn Ser Ser Lys Leu 
130 135 140 

Glu Gly Arg Leu Leu Ser Cys Arg Leu Ser Asp Lys Ala Asn Asp Gln 
145 150 155 160 

Leu Tyr Gln Trp Phe Glu Asp Asp Asp Ala Leu Glu Glu Ala Met Gly 
165 170 175 

Gly Glu Trp Tyr Leu Ile Ala Thr Ser Glu Gly Asn Cys Asn Ser Leu 
180 185 190 

Gln Pro Ile His Leu Ile Arg Asp Glu Gln Arg Ser Leu Phe Val Gly 
195 200 205 

Ser Arg Ser Asp Pro Asn Asp Ser Ala Ser Ser Leu Ser Leu Ser Ser 
210 215 220 

Pro Gln Ile Ser Glu Arg His Ala Thr Ile Thr Cys Lys Asn Lys Ala 
225 230 235 240 

Phe Tyr Leu Thr Asp Leu Gly Ser Glu His Gly Thr Trp Ile Thr Asp 
245 250 255 

Asn Glu Gly Arg Arg Tyr Arg Val Pro Pro Asn Phe Pro Val Arg Phe 
260 265 270 

His Pro Ser Asp Val Ile Glu Phe Gly Ser Asp Lys Lys Ala Met Phe 
275 280 285 

Arg Val Lys Val Leu Asn Thr Leu Pro Tyr Glu Ser Ala Arg Ser Gly 
290 295 300 

Asn Arg Gln Gln Gln Gln Val Leu Gln Ala Ala 
305 310 315 


<210> 21 

<211> 926 

<212> DNA 

<213> Glycine max 



<400> 21 

gcacgagcat gatggtgata ttttaatagg agcagatgga atatggtcag aagtgcgttc 60 
aaaactcttt gggcagcaag aagcaaatta ctcgggtttc acatgctaca gtggattaac 120 
aagctatgtg cccccatata ttgataccgt tgggtatcgg gtgttcttgg gcttgaacca 180 
gtactttgtt gcttcagatg ttggccatgg gaagatgcag tggtatgctt tccatgggga 240 
acccccttca agtgaccctt tcccagaagg taagaagaag aggcttttgg atctctttgg 300 
taattggtgc gatgaagtga ttgcactcat atcagaaaca ccagaacata tgattataca 360 
gagggatata tatgacagag acatgatcaa cacttgggga attgggagag tgactttgtt 420 
aggtgatgca gcacatccaa tgcaaccaaa tcttggtcaa ggagggtgta tggcaataga 480 
ggattgttac caactgatac ttgagctaga caaggttgct aaacatggct ctgacgggtc 540 
tgaagttatc tcagctctta gaagatatga gaagaaaaga atcccccgag ttagggtgtt 600 
acacacagct agcaggatgg catcgcaaat gttagtcaac taccggcctt atattgaatt 660 
taaattttgg cctctatcaa atgtaacaac tatgcagata aagcaccctg gcattcatgt 720 
agctcaagcc cttttcaagt tcacttttcc acaatttgtt acttggatga ttgctggcca 780 
tgggttgtgg tgaacactca tgcaacttga aaataaaaag ggctcaacaa ttttaacatg 840 
atggtagtta aaagttaatt ttattgggct atgtaggaac ttttctttcg gaataaacgt 900 
gccataattt aaaaaaaaaa aaaaaa 926 
<210> 22 

<211> 263 

<212> PRT 

<213> Glycine max 

<400> 22 

His Glu His Asp Gly Asp Ile Leu Ile Gly Ala Asp Gly Ile Trp Ser 
1 5 10 15 

Glu Val Arg Ser Lys Leu Phe Gly Gln Gln Glu Ala Asn Tyr Ser Gly 
20 25 30 

Phe Thr Cys Tyr Ser Gly Leu Thr Ser Tyr Val Pro Pro Tyr Ile Asp 
35 40 45 

Thr Val Gly Tyr Arg Val Phe Leu Gly Leu Asn Gln Tyr Phe Val Ala 
50 55 60 

Ser Asp Val Gly His Gly Lys Met Gln Trp Tyr Ala Phe His Gly Glu 
65 70 75 80 

Pro Pro Ser Ser Asp Pro Phe Pro Glu Gly Lys Lys Lys Arg Leu Leu 
85 90 95 

Asp Leu Phe Gly Asn Trp Cys Asp Glu Val Ile Ala Leu Ile Ser Glu 
100 105 110 

Thr Pro Glu His Met Ile Ile Gln Arg Asp Ile Tyr Asp Arg Asp Met 
115 120 125 

Ile Asn Thr Trp Gly Ile Gly Arg Val Thr Leu Leu Gly Asp Ala Ala 
130 135 140 

His Pro Met Gln Pro Asn Leu Gly Gln Gly Gly Cys Met Ala Ile Glu 
145 150 155 160 

Asp Cys Tyr Gln Leu Ile Leu Glu Leu Asp Lys Val Ala Lys His Gly 
165 170 175 

Ser Asp Gly Ser Glu Val Ile Ser Ala Leu Arg Arg Tyr Glu Lys Lys 
180 185 190 

Arg Ile Pro Arg Val Arg Val Leu His Thr Ala Ser Arg Met Ala Ser 
195 200 205 

Gln Met Leu Val Asn Tyr Arg Pro Tyr Ile Glu Phe Lys Phe Trp Pro 
210 215 220 

Leu Ser Asn Val Thr Thr Met Gln Ile Lys His Pro Gly Ile His Val 
225 230 235 240 

Ala Gln Ala Leu Phe Lys Phe Thr Phe Pro Gln Phe Val Thr Trp Met 
245 250 255 

Ile Ala Gly His Gly Leu Trp 
260 

<210> 23 

<211> 1528 

<212> DNA 

<213> Glycine max 
<400> 23 

cacaaaacac acacacacat attctcacac aaactgcaac catggctact accttatgtt 60 
acaattctct taacccttca acaaccgttt tctcaagaac ccatttctca gttcccttga 120 
ataaagagct tccactggat gcttcacctt ttgttgttgg ctataactgt ggtgtaggat 180 
gcagaacaag gaagcaaagg aagaaagtga tgcatgtgaa gtgtgcagtg gtggaggctc 240 
caccaggtgt ttcaccctca gcaaaagatg ggaatgggaa ccaccccttc cgaagaagca 300 
gcttcgtata cttgtggctg gtggagggat tggagggttg gtttttgctt tgggctgcaa 360 
agagaaaggg gtttgaggtg atggtgtttg agaaggactt gagtgctata agaggggagg 420 
gacagtatag gggtccaatt cagattcaga gcaatgcttt ggctgctttg gaagctattg 480 
attcagaggt tgctgatgaa gttatgagag ttggttgcat cactggtgat agaatcaatg 540 
gacttgtaga tggggtttct ggttcttggt acgtcaagtt tgatacattc actcctgcag 600 
tggaacgtgg gcttcctgtc acaagagtta ttagtcgaat ggttttacaa gagatccttg 660 
ctcgcgcagt tggggaagat atcattatga atgccagtaa tgttgttaat tttgtggatg 720 
atggaaacaa ggtaacagta gagctagaga atggtcagaa atatgaagga gatgtcttgg 780 
ttggagcgga tggaatatgg tccaaggtga ggaagcagtt atttgggctc acagaagctg 840 
tttactctgg ttatacttgt tatactggca ttgcagattt tgtgcctgct gacattgaaa 900 
ctgttggata ccgagtattc ttgggacaca aacaatactt tgtatcttca gatgttggtg 960 
cgggaaagat gcaatggtat gcatttcaca aagaaactcc cggtggggtt gatgagccca 1020 
acggaaaaaa ggaaaggttg cttaggatat ttgagggctg gtgtgaaagt gctgtagatc 1080 
tgatacttgc cacagaagaa gaagcaattc taagacgaga catatatgac aggataccaa 1140 
cattgacatg gggaaagggt cgcgtgactt tgcttggtga ttccgtccat gccatgcagc 1200 
caaatatggg ccaaggaggg tgcatggcta ttgaggacag ttatcaactt gcatgggagt 1260 
tggagaatgc atgggaacaa agtattaaat cagggagtcc aattgacatt gattcttccc 1320 
taaggagcta cgagagagaa agaagactac gagttgccat tattcatgga atggctagaa 1380 
tggcggctct catggcttcc acttacaagg catatctggg tgttggtctt ggccctttag 1440 
aatttttgac taagtttcgt ataccacatc ctggaagagt tggaggaagg ttttttgttg 1500 
acatcatgat gccttctatg ttgatgtt 1528 

<210> 24 

<211> 495 

<212> PRT 

<213> Glycine max 

<400> 24 

Met Ala Thr Thr Leu Cys Tyr Asn Ser Leu Asn Pro Ser Thr Thr Val
1 5 10 15

Phe Ser Arg Thr His Phe Ser Val Pro Leu Asn Lys Glu Leu Pro Leu
20 25 30

Asp Ala Ser Pro Phe Val Val Gly Tyr Asn Cys Gly Val Gly Cys Arg
35 40 45

Thr Arg Lys Gln Arg Lys Lys Val Met His Val Lys Cys Ala Val Val
50 55 60

Glu Ala Pro Pro Gly Val Ser Pro Ser Ala Lys Asp Gly Asn Gly Asn
65 70 75 80

His Pro Phe Arg Arg Ser Ser Phe Val Tyr Leu Trp Leu Val Glu Gly
85 90 95

Leu Glu Gly Trp Phe Leu Leu Trp Ala Ala Lys Arg Lys Gly Phe Glu
100 105 110

Val Met Val Phe Glu Lys Asp Leu Ser Ala Ile Arg Gly Glu Gly Gln
115 120 125

Tyr Arg Gly Pro Ile Gln Ile Gln Ser Asn Ala Leu Ala Ala Leu Glu
130 135 140

Ala Ile Asp Ser Glu Val Ala Asp Glu Val Met Arg Val Gly Cys Ile
145 150 155 160

Thr Gly Asp Arg Ile Asn Gly Leu Val Asp Gly Val Ser Gly Ser Trp
165 170 175

Tyr Val Lys Phe Asp Thr Phe Thr Pro Ala Val Glu Arg Gly Leu Pro
180 185 190

Val Thr Arg Val Ile Ser Arg Met Val Leu Gln Glu Ile Leu Ala Arg
195 200 205

Ala Val Gly Glu Asp Ile Ile Met Asn Ala Ser Asn Val Val Asn Phe
210 215 220

Val Asp Asp Gly Asn Lys Val Thr Val Glu Leu Glu Asn Gly Gln Lys
225 230 235 240

Tyr Glu Gly Asp Val Leu Val Gly Ala Asp Gly Ile Trp Ser Lys Val
245 250 255

Arg Lys Gln Leu Phe Gly Leu Thr Glu Ala Val Tyr Ser Gly Tyr Thr
260 265 270

Cys Tyr Thr Gly Ile Ala Asp Phe Val Pro Ala Asp Ile Glu Thr Val
275 280 285

Gly Tyr Arg Val Phe Leu Gly His Lys Gln Tyr Phe Val Ser Ser Asp
290 295 300

Val Gly Ala Gly Lys Met Gln Trp Tyr Ala Phe His Lys Glu Thr Pro
305 310 315 320

Gly Gly Val Asp Glu Pro Asn Gly Lys Lys Glu Arg Leu Leu Arg Ile
325 330 335

Phe Glu Gly Trp Cys Glu Ser Ala Val Asp Leu Ile Leu Ala Thr Glu
340 345 350

Glu Glu Ala Ile Leu Arg Arg Asp Ile Tyr Asp Arg Ile Pro Thr Leu
355 360 365

Thr Trp Gly Lys Gly Arg Val Thr Leu Leu Gly Asp Ser Val His Ala
370 375 380

Met Gln Pro Asn Met Gly Gln Gly Gly Cys Met Ala Ile Glu Asp Ser
385 390 395 400

Tyr Gln Leu Ala Trp Glu Leu Glu Asn Ala Trp Glu Gln Ser Ile Lys
405 410 415

Ser Gly Ser Pro Ile Asp Ile Asp Ser Ser Leu Arg Ser Tyr Glu Arg
420 425 430

Glu Arg Arg Leu Arg Val Ala Ile Ile His Gly Met Ala Arg Met Ala
435 440 445

Ala Leu Met Ala Ser Thr Tyr Lys Ala Tyr Leu Gly Val Gly Leu Gly 
450 455 460 

Pro Leu Glu Phe Leu Thr Lys Phe Arg Ile Pro His Pro Gly Arg Val
465 470 475 480

Gly Gly Arg Phe Phe Val Asp Ile Met Met Pro Ser Met Leu Met
485 490 495

<210> 25 

<211> 686 

<212> DNA 

<213> Glycine max 

<400> 25 

aacaagatgg aacaggtctt tcaaagccta tatctttaag tcgaaatgag atgaaaccct 60 
tcataatcgg gagtgcacca atgcaagata attcaggcag ttcagttaca atttcttcac 120 
cacaggtttc tccaacgcat gctcgaatta actataagga tggtgccttc ttcttgattg 180 
atttacggag tgagcatggc acctggatca ttgacaacga aggaaagcag taccgggtac 240 
ctcctaatta tcctgctcgc atccgtccat ctgatgttat tcagtttggt tctgagaagg 300 
tttcgttccg tgttaaggtg acaagctctg ttccaagagt ctcagaaaat gaaagcacac 360 
tagctttgca gggagtatga ctgattctgc tcaattgcaa tttgtaagtt atggaaaaat 420 
tatacagcac aaatttgcta ttgtatagta ctatctgcat tgttttaggg tggggtatta 480 
taccacagtc tagtcattta agatctgata tgttacatgc ctatatggac atttaagagg 540 
gactcttggg tataaatttg ttactccact ccaatacttt ttgtgtatga catttgtaat 600 
ttgttagagt tagatttata acatgacaca cataaacttg cacgtgatta aaaaaaaaaa 660 
aaaaaaaaaa aaaaaaaaaa aaaaaa 686 

<210> 26 

<211> 125 

<212> PRT 

<213> Glycine max 

<400> 26 

Gln Asp Gly Thr Gly Leu Ser Lys Pro Ile Ser Leu Ser Arg Asn Glu
1 5 10 15

Met Lys Pro Phe Ile Ile Gly Ser Ala Pro Met Gln Asp Asn Ser Gly
20 25 30

Ser Ser Val Thr Ile Ser Ser Pro Gln Val Ser Pro Thr His Ala Arg
35 40 45

Ile Asn Tyr Lys Asp Gly Ala Phe Phe Leu Ile Asp Leu Arg Ser Glu
50 55 60

His Gly Thr Trp Ile Ile Asp Asn Glu Gly Lys Gln Tyr Arg Val Pro
65 70 75 80

Pro Asn Tyr Pro Ala Arg Ile Arg Pro Ser Asp Val Ile Gln Phe Gly
85 90 95

Ser Glu Lys Val Ser Phe Arg Val Lys Val Thr Ser Ser Val Pro Arg
100 105 110

Val Ser Glu Asn Glu Ser Thr Leu Ala Leu Gln Gly Val
115 120 125

<210> 27 

<211> 310 

<212> PRT 

<213> Lycopersicon esculentum 
<400> 27

Asp Pro Asp Ile Val Leu Pro Gly Asn Leu Gly Leu Leu Ser Glu Ala
1 5 10 15

Tyr Asp Arg Cys Gly Glu Val Cys Ala Glu Tyr Ala Lys Thr Phe Tyr
20 25 30

Leu Gly Thr Met Leu Met Thr Pro Asp Arg Arg Arg Ala Ile Trp Ala
35 40 45

Ile Tyr Val Trp Cys Arg Arg Thr Asp Glu Leu Val Asp Gly Pro Asn
50 55 60

Ala Ser His Ile Thr Pro Gln Ala Leu Asp Arg Trp Glu Ala Arg Leu
65 70 75 80

Glu Asp Ile Phe Asn Gly Arg Pro Phe Asp Met Leu Asp Ala Ala Leu
85 90 95

Ser Asp Thr Val Ser Arg Phe Pro Val Asp Ile Gln Pro Phe Arg Asp
100 105 110

Met Val Glu Gly Met Arg Met Asp Leu Trp Lys Ser Arg Tyr Asn Asn
115 120 125

Phe Asp Glu Leu Tyr Leu Tyr Cys Tyr Tyr Val Ala Gly Thr Val Gly
130 135 140

Leu Met Ser Val Pro Ile Met Gly Ile Ala Pro Glu Ser Lys Ala Thr
145 150 155 160

Thr Glu Ser Val Tyr Asn Ala Ala Leu Ala Leu Gly Ile Ala Asn Gln
165 170 175

Leu Thr Asn Ile Leu Arg Asp Val Gly Glu Asp Ala Arg Arg Gly Arg
180 185 190

Val Tyr Leu Pro Gln Asp Glu Leu Ala Gln Ala Gly Leu Ser Asp Glu
195 200 205

Asp Ile Phe Ala Gly Lys Val Thr Asp Lys Trp Arg Ile Phe Met Lys
210 215 220

Lys Gln Ile Gln Arg Ala Arg Lys Phe Phe Asp Glu Ala Glu Lys Gly
225 230 235 240

Val Thr Glu Leu Ser Ser Ala Ser Arg Trp Pro Val Leu Ala Ser Leu
245 250 255

Leu Leu Tyr Arg Lys Ile Leu Asp Glu Ile Glu Ala Asn Asp Tyr Asn
260 265 270

Asn Phe Thr Arg Arg Ala Tyr Val Ser Lys Pro Lys Lys Leu Leu Thr
275 280 285

Leu Pro Ile Ala Tyr Ala Arg Ser Leu Val Pro Pro Lys Ser Thr Ser
290 295 300

Cys Pro Leu Ala Lys Thr
305 310

<210> 28

<211> 410 

<212> PRT 

<213> Zea mays 

<400> 28 

Met Ala Ile Ile Leu Val Arg Ala Ala Ser Pro Gly Leu Ser Ala Ala
1 5 10 15

Asp Ser Ile Ser His Gln Gly Thr Leu Gln Cys Ser Thr Leu Leu Lys
20 25 30

Thr Lys Arg Pro Ala Ala Arg Arg Trp Met Pro Cys Ser Leu Leu Gly
35 40 45

Leu His Pro Trp Glu Ala Gly Arg Pro Ser Pro Ala Val Tyr Ser Ser
50 55 60

Leu Pro Val Asn Pro Ala Gly Glu Ala Val Val Ser Ser Glu Gln Lys
65 70 75 80

Val Tyr Asp Val Val Leu Lys Gln Ala Ala Leu Leu Lys Arg Gln Leu
85 90 95

Arg Thr Pro Val Leu Asp Ala Arg Pro Gln Asp Met Asp Met Pro Arg
100 105 110

Asn Gly Leu Lys Glu Ala Tyr Asp Arg Cys Gly Glu Ile Cys Glu Glu
115 120 125

Tyr Ala Lys Thr Phe Tyr Leu Gly Thr Met Leu Met Thr Glu Glu Arg
130 135 140

Arg Arg Ala Ile Trp Ala Ile Tyr Val Trp Cys Arg Arg Thr Asp Glu
145 150 155 160

Leu Val Asp Gly Pro Asn Ala Asn Tyr Ile Thr Pro Thr Ala Leu Asp
165 170 175

Arg Trp Glu Lys Arg Leu Glu Asp Leu Phe Thr Gly Arg Pro Tyr Asp
180 185 190

Met Leu Asp Ala Ala Leu Ser Asp Thr Ile Ser Arg Phe Pro Ile Asp
195 200 205

Ile Gln Pro Phe Arg Asp Met Ile Glu Gly Met Arg Ser Asp Leu Arg
210 215 220

Lys Thr Arg Tyr Asn Asn Phe Asp Glu Leu Tyr Met Tyr Cys Tyr Tyr
225 230 235 240

Val Ala Gly Thr Val Gly Leu Met Ser Val Pro Val Met Gly Ile Ala
245 250 255

Thr Glu Ser Lys Ala Thr Thr Glu Ser Val Tyr Ser Ala Ala Leu Ala
260 265 270

Leu Gly Ile Ala Asn Gln Leu Thr Asn Ile Leu Arg Asp Val Gly Glu
275 280 285

Asp Ala Arg Arg Gly Arg Ile Tyr Leu Pro Gln Asp Glu Leu Ala Gln
290 295 300

Ala Gly Leu Ser Asp Glu Asp Ile Phe Lys Gly Val Val Thr Asn Arg
305 310 315 320

Trp Arg Asn Phe Met Lys Arg Gln Ile Lys Arg Ala Arg Met Phe Phe
325 330 335

Glu Glu Ala Glu Arg Gly Val Asn Glu Leu Ser Gln Ala Ser Arg Trp
340 345 350

Pro Val Trp Ala Ser Leu Leu Leu Tyr Arg Gln Ile Leu Asp Glu Ile
355 360 365

Glu Ala Asn Asp Tyr Asn Asn Phe Thr Lys Arg Ala Tyr Val Gly Lys
370 375 380

Gly Lys Lys Leu Leu Ala Leu Pro Val Ala Tyr Gly Lys Ser Leu Leu
385 390 395 400

Leu Pro Cys Ser Leu Arg Asn Gly Gln Thr
405 410
because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a) 

This International Searching Authority found multiple (groups of) 
inventions in this international application, as follows: 

1. Claims: 1-5,11-16 partially 

Isolation of gene sequences representing a maize-specific 
cDNA encoding Phytoene Synthase; namely SEQIDs 1, 
furthermore the corresponding deduced amino acid sequences 
SEQIDs 2. 



2. Claims: 1-5,11-16 partially 

Isolation of gene sequences representing a maize-specific 
cDNA encoding Phytoene Synthase; namely SEQIDs 3, 
furthermore the corresponding deduced amino acid sequences 
SEQIDs 4. 



3. Claims: 1-5,11-16 partially 

Isolation of gene sequences representing a rice-specific 
cDNA encoding Phytoene Synthase; namely SEQIDs 5, 
furthermore the corresponding deduced amino acid sequences 
SEQIDs 6. 



4. Claims: 1-5,11-16 partially 

Isolation of gene sequences representing a rice-specific 
cDNA encoding Phytoene Synthase; namely SEQIDs 7, 
furthermore the corresponding deduced amino acid sequences 
SEQIDs 8. 



5. Claims: 1-5,11-16 partially 

Isolation of gene sequences representing a rice-specific 
cDNA encoding Phytoene Synthase; namely SEQIDs 9, 
furthermore the corresponding deduced amino acid sequences 
SEQIDs 10. 



6. Claims: 1-5,11-16 partially 

Isolation of gene sequences representing soybean-specific 
cDNAs encoding Phytoene Synthase; namely SEQIDs 11 and 13, 
furthermore the corresponding deduced amino acid sequences 
SEQIDs 12 and 14. 



7. Claims: 1-5,11-16 partially 

Isolation of gene sequences representing a wheat-specific 
cDNA encoding Phytoene Synthase; namely SEQIDs 15, 
furthermore the corresponding deduced amino acid sequences 
SEQIDs 16. 



8. Claims: 6-10,11-16 partially 

Isolation of gene sequences representing maize-specific 
cDNAs encoding Zeaxanthin Epoxidase; namely SEQIDs 17 and 
19, furthermore the corresponding deduced amino acid 
sequences SEQIDs 18 and 20. 



9. Claims: 6-10,11-16 partially 

Isolation of gene sequences representing soybean-specific 
cDNAs encoding Zeaxanthin Epoxidase; namely SEQIDs 21, 23 and 
25, furthermore the corresponding deduced amino acid 
sequences SEQIDs 22, 24 and 26. 